-
Planted: a dataset for planted forest identification from multi-satellite time series
Authors:
Luis Miguel Pazos-Outón,
Cristina Nader Vasconcelos,
Anton Raichuk,
Anurag Arnab,
Dan Morris,
Maxim Neumann
Abstract:
Protecting and restoring forest ecosystems is critical for biodiversity conservation and carbon sequestration. Forest monitoring on a global scale is essential for prioritizing and assessing conservation efforts. Satellite-based remote sensing is the only viable solution for providing global coverage, but to date, large-scale forest monitoring is limited to single modalities and single time points. In this paper, we present a dataset consisting of data from five public satellites for recognizing forest plantations and planted tree species across the globe. Each satellite modality consists of a multi-year time series. The dataset, named Planted, includes over 2M examples of 64 tree label classes (46 genera and 40 species), distributed among 41 countries. This dataset is released to foster research in forest monitoring using multimodal, multi-scale, multi-temporal data sources. Additionally, we present initial baseline results and evaluate modality fusion and data augmentation approaches for this dataset.
Submitted 24 May, 2024;
originally announced June 2024.
-
Public Computer Vision Datasets for Precision Livestock Farming: A Systematic Survey
Authors:
Anil Bhujel,
Yibin Wang,
Yuzhen Lu,
Daniel Morris,
Mukesh Dangol
Abstract:
Technology-driven precision livestock farming (PLF) empowers practitioners to monitor and analyze animal growth and health conditions for improved productivity and welfare. Computer vision (CV) is indispensable in PLF by using cameras and computer algorithms to supplement or supersede manual efforts for livestock data acquisition. Data availability is crucial for developing innovative monitoring and analysis systems through artificial intelligence-based techniques. However, data curation processes are tedious, time-consuming, and resource intensive. This study presents the first systematic survey of publicly available livestock CV datasets (https://github.com/Anil-Bhujel/Public-Computer-Vision-Dataset-A-Systematic-Survey). Among 58 public datasets identified and analyzed, encompassing different species of livestock, almost half of them are for cattle, followed by swine, poultry, and other animals. Individual animal detection and color imaging are the dominant application and imaging modality for livestock. The characteristics and baseline applications of the datasets are discussed, emphasizing the implications for animal welfare advocates. Challenges and opportunities are also discussed to inspire further efforts in developing livestock CV datasets. This study highlights that the limited quantity of high-quality annotated datasets collected from diverse environments, animals, and applications, and the absence of contextual metadata, are a real bottleneck in PLF.
Submitted 15 June, 2024;
originally announced June 2024.
-
Self-Annotated 3D Geometric Learning for Smeared Points Removal
Authors:
Miaowei Wang,
Daniel Morris
Abstract:
There has been significant progress in improving the accuracy and quality of consumer-level dense depth sensors. Nevertheless, there remains a common depth pixel artifact which we call smeared points. These are points not on any 3D surface and typically occur as interpolations between foreground and background objects. As they cause fictitious surfaces, these points have the potential to harm applications dependent on the depth maps. Statistical outlier removal methods fare poorly in removing these points as they tend also to remove actual surface points. Trained network-based point removal faces difficulty in obtaining sufficient annotated data. To address this, we propose a fully self-annotated method to train a smeared point removal classifier. Our approach relies on gathering 3D geometric evidence from multiple perspectives to automatically detect and annotate smeared points and valid points. To validate the effectiveness of our method, we present a new benchmark dataset: the Real Azure-Kinect dataset. Experimental results and ablation studies show that our method outperforms traditional filters and other self-annotated methods. Our work is publicly available at https://github.com/wangmiaowei/wacv2024_smearedremover.git.
Submitted 15 November, 2023;
originally announced November 2023.
-
Absorbing Sets in Quantum LDPC Codes
Authors:
Kirsten D. Morris,
Tefjol Pllaha,
Christine A. Kelley
Abstract:
Iterative decoder failures of quantum low density parity check (QLDPC) codes are attributed to substructures in the code's graph, known as trapping sets, as well as degenerate errors that can arise in quantum codes. Failure inducing sets are subsets of codeword coordinates that, when initially in error, lead to decoding failure in a trapping set. The purpose of this paper is to examine failure inducing sets of QLDPC codes under syndrome-based iterative decoding. As for classical LDPC codes, we show that absorbing sets play a central role in understanding decoder failures. Raveendran and Vasic initiated the study of quantum trapping sets, where beyond the classical-type trapping sets, they identified rigid symmetric structures (a.k.a. symmetric stabilizers) responsible for degenerate errors. In this paper, we show that this behavior is part of a much more general phenomenon that can be described by the absorbing set framework.
Submitted 10 May, 2024; v1 submitted 26 July, 2023;
originally announced July 2023.
-
Label-Efficient Learning in Agriculture: A Comprehensive Review
Authors:
Jiajia Li,
Dong Chen,
Xinda Qi,
Zhaojian Li,
Yanbo Huang,
Daniel Morris,
Xiaobo Tan
Abstract:
The past decade has witnessed many great successes of machine learning (ML) and deep learning (DL) applications in agricultural systems, including weed control, plant disease diagnosis, agricultural robotics, and precision livestock management. Despite tremendous progress, one downside of such ML/DL models is that they generally rely on large-scale labeled datasets for training, and the performance of such models is strongly influenced by the size and quality of available labeled data samples. In addition, collecting, processing, and labeling such large-scale datasets is extremely costly and time-consuming, partially due to the rising cost of human labor. Therefore, developing label-efficient ML/DL methods for agricultural applications has received significant interest among researchers and practitioners. In fact, there are more than 50 papers on developing and applying deep-learning-based label-efficient techniques to address various agricultural problems since 2016, which motivates the authors to provide a timely and comprehensive review of recent label-efficient ML/DL methods in agricultural applications. To this end, we first develop a principled taxonomy to organize these methods according to the degree of supervision, including weak supervision (i.e., active learning and semi-/weakly- supervised learning), and no supervision (i.e., un-/self- supervised learning), supplemented by representative state-of-the-art label-efficient ML/DL methods. In addition, a systematic review of various agricultural applications exploiting these label-efficient algorithms, such as precision agriculture, plant phenotyping, and postharvest quality assessment, is presented. Finally, we discuss the current problems and challenges, as well as future research directions. A well-classified paper list can be accessed at https://github.com/DongChen06/Label-efficient-in-Agriculture.
Submitted 23 May, 2023;
originally announced May 2023.
-
TransCAR: Transformer-based Camera-And-Radar Fusion for 3D Object Detection
Authors:
Su Pang,
Daniel Morris,
Hayder Radha
Abstract:
Despite radar's popularity in the automotive industry, for fusion-based 3D object detection, most existing works focus on LiDAR and camera fusion. In this paper, we propose TransCAR, a Transformer-based Camera-And-Radar fusion solution for 3D object detection. Our TransCAR consists of two modules. The first module learns 2D features from surround-view camera images and then uses a sparse set of 3D object queries to index into these 2D features. The vision-updated queries then interact with each other via a transformer self-attention layer. The second module learns radar features from multiple radar scans and then applies a transformer decoder to learn the interactions between radar features and vision-updated queries. The cross-attention layer within the transformer decoder can adaptively learn the soft-association between the radar features and vision-updated queries instead of hard-association based on sensor calibration only. Finally, our model estimates a bounding box per query using set-to-set Hungarian loss, which enables the method to avoid non-maximum suppression. TransCAR improves the velocity estimation using the radar scans without temporal information. The superior experimental results of our TransCAR on the challenging nuScenes datasets illustrate that our TransCAR outperforms state-of-the-art Camera-Radar fusion-based 3D object detection approaches.
Submitted 30 April, 2023;
originally announced May 2023.
-
Feasibility of access EGI resources through the ESCAPE developed ESFRI Science Analysis Platform
Authors:
Giuliano Taffoni,
Sara Bertocco,
Dave Morris,
Manu Parra-Royón,
Klaas Kliffen,
Marco Molinaro,
John Swinbank,
Susana Sanchez Exposito
Abstract:
The EU ESCAPE project is developing ESAP, ESFRI Scientific Analysis Platform, as an API gateway that enables the seamless integration of independent services accessing distributed data and computing resources. In ESCAPE we are exploring the possibility of exploiting EGI's OpenStack cloud computing services through ESAP. In our contribution we briefly describe ESCAPE and ESAP, the use cases, the work done to automate virtual machine creation in EGI's OpenStack cloud computing, drawbacks and possible solutions.
Submitted 24 January, 2023;
originally announced January 2023.
-
Multi-modal Program Inference: a Marriage of Pre-trained Language Models and Component-based Synthesis
Authors:
Kia Rahmani,
Mohammad Raza,
Sumit Gulwani,
Vu Le,
Daniel Morris,
Arjun Radhakrishna,
Gustavo Soares,
Ashish Tiwari
Abstract:
Multi-modal program synthesis refers to the task of synthesizing programs (code) from their specification given in different forms, such as a combination of natural language and examples. Examples provide a precise but incomplete specification, and natural language provides an ambiguous but more "complete" task description. Machine-learned pre-trained models (PTMs) are adept at handling ambiguous natural language, but struggle with generating syntactically and semantically precise code. Program synthesis techniques can generate correct code, often even from incomplete but precise specifications, such as examples, but they are unable to work with the ambiguity of natural languages. We present an approach that combines PTMs with component-based synthesis (CBS): PTMs are used to generate candidate programs from the natural language description of the task, which are then used to guide the CBS procedure to find the program that matches the precise examples-based specification. We use our combination approach to instantiate multi-modal synthesis systems for two programming domains: the domain of regular expressions and the domain of CSS selectors. Our evaluation demonstrates the effectiveness of our domain-agnostic approach in comparison to a state-of-the-art specialized system, and the generality of our approach in providing multi-modal program synthesis from natural language and examples in different programming domains.
Submitted 3 September, 2021;
originally announced September 2021.
-
Full-Velocity Radar Returns by Radar-Camera Fusion
Authors:
Yunfei Long,
Daniel Morris,
Xiaoming Liu,
Marcos Castro,
Punarjay Chakravarty,
Praveen Narayanan
Abstract:
A distinctive feature of Doppler radar is the measurement of velocity in the radial direction for radar points. However, the missing tangential velocity component hampers object velocity estimation as well as temporal integration of radar sweeps in dynamic scenes. Recognizing that fusing camera with radar provides complementary information to radar, in this paper we present a closed-form solution for the point-wise, full-velocity estimate of Doppler returns using the corresponding optical flow from camera images. Additionally, we address the association problem between radar returns and camera images with a neural network that is trained to estimate radar-camera correspondences. Experimental results on the nuScenes dataset verify the validity of the method and show significant improvements over the state-of-the-art in velocity estimation and accumulation of radar points.
Submitted 24 August, 2021;
originally announced August 2021.
-
Sequen-C: A Multilevel Overview of Temporal Event Sequences
Authors:
Jessica Magallanes,
Tony Stone,
Paul D Morris,
Suzanne Mason,
Steven Wood,
Maria-Cruz Villa-Uriol
Abstract:
Building a visual overview of temporal event sequences with an optimal level-of-detail (i.e. simplified but informative) is an ongoing challenge - expecting the user to zoom into every important aspect of the overview can lead to missing insights. We propose a technique to build a multilevel overview of event sequences, whose granularity can be transformed across sequence clusters (vertical level-of-detail) or longitudinally (horizontal level-of-detail), using hierarchical aggregation and a novel cluster data representation Align-Score-Simplify. By default, the overview shows an optimal number of sequence clusters obtained through the average silhouette width metric - then users are able to explore alternative optimal sequence clusterings. The vertical level-of-detail of the overview changes along with the number of clusters, whilst the horizontal level-of-detail refers to the level of summarization applied to each cluster representation. The proposed technique has been implemented into a visualization system called Sequence Cluster Explorer (Sequen-C) that allows multilevel and detail-on-demand exploration through three coordinated views, and the inspection of data attributes at cluster, unique sequence, and individual sequence level. We present two case studies using real-world datasets in the healthcare domain: CUREd and MIMIC-III; which demonstrate how the technique can aid users to obtain a summary of common and deviating pathways, and explore data attributes for selected patterns.
Submitted 6 August, 2021;
originally announced August 2021.
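The abstract above selects the default number of sequence clusters with the average silhouette width metric. As a rough illustration of that metric only, here is a minimal pure-Python sketch. The function name `average_silhouette_width`, the generic `dist` callable, and the toy 1-D data are assumptions made for this example; the abstract does not say how distances between event sequences are computed, so any suitable pairwise distance would have to be supplied.

```python
from typing import Callable, Dict, Hashable, List, Sequence, TypeVar

T = TypeVar("T")


def average_silhouette_width(
    items: Sequence[T],
    labels: Sequence[Hashable],
    dist: Callable[[T, T], float],
) -> float:
    """Mean silhouette value over all items for a given clustering."""
    clusters: Dict[Hashable, List[int]] = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    total = 0.0
    for i, item in enumerate(items):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist(item, items[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(item, items[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != labels[i]
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(items)


# Toy usage: 1-D points stand in for sequences (an assumption for illustration).
points = [0.0, 1.0, 10.0, 11.0]
tight = average_silhouette_width(points, [0, 0, 1, 1], lambda p, q: abs(p - q))
mixed = average_silhouette_width(points, [0, 1, 0, 1], lambda p, q: abs(p - q))
print(tight > mixed)
```

A clustering that scores higher on this metric groups items more compactly, which is why it can serve as a criterion for picking a default number of clusters.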
-
Evaluation of Automated Image Descriptions for Visually Impaired Students
Authors:
Anett Hoppe,
David Morris,
Ralph Ewerth
Abstract:
Illustrations are widely used in education, and sometimes, alternatives are not available for visually impaired students. Therefore, those students would benefit greatly from an automatic illustration description system, but only if those descriptions were complete, correct, and easily understandable using a screenreader. In this paper, we report on a study for the assessment of automated image descriptions. We interviewed experts to establish evaluation criteria, which we then used to create an evaluation questionnaire for sighted non-expert raters, and description templates. We used this questionnaire to evaluate the quality of descriptions which could be generated with a template-based automatic image describer. We present evidence that these templates have the potential to generate useful descriptions, and that the questionnaire identifies problems with description templates.
Submitted 29 June, 2021;
originally announced June 2021.
-
Multi-Label Learning from Single Positive Labels
Authors:
Elijah Cole,
Oisin Mac Aodha,
Titouan Lorieul,
Pietro Perona,
Dan Morris,
Nebojsa Jojic
Abstract:
Predicting all applicable labels for a given image is known as multi-label classification. Compared to the standard multi-class case (where each image has only one label), it is considerably more challenging to annotate training data for multi-label classification. When the number of potential labels is large, human annotators find it difficult to mention all applicable labels for each training image. Furthermore, in some settings detection is intrinsically difficult e.g. finding small object instances in high resolution images. As a result, multi-label training data is often plagued by false negatives. We consider the hardest version of this problem, where annotators provide only one relevant label for each image. As a result, training sets will have only one positive label per image and no confirmed negatives. We explore this special case of learning from missing labels across four different multi-label image classification datasets for both linear classifiers and end-to-end fine-tuned deep networks. We extend existing multi-label losses to this setting and propose novel variants that constrain the number of expected positive labels during training. Surprisingly, we show that in some cases it is possible to approach the performance of fully labeled classifiers despite training with significantly fewer confirmed labels.
Submitted 22 October, 2021; v1 submitted 17 June, 2021;
originally announced June 2021.
-
Radar-Camera Pixel Depth Association for Depth Completion
Authors:
Yunfei Long,
Daniel Morris,
Xiaoming Liu,
Marcos Castro,
Punarjay Chakravarty,
Praveen Narayanan
Abstract:
While radar and video data can be readily fused at the detection level, fusing them at the pixel level is potentially more beneficial. This is also more challenging in part due to the sparsity of radar, but also because automotive radar beams are much wider than a typical pixel combined with a large baseline between camera and radar, which results in poor association between radar pixels and color pixels. A consequence is that depth completion methods designed for LiDAR and video fare poorly for radar and video. Here we propose a radar-to-pixel association stage which learns a mapping from radar returns to pixels. This mapping also serves to densify radar returns. Using this as a first stage, followed by a more traditional depth completion method, we are able to achieve image-guided depth completion with radar and video. We demonstrate performance superior to camera and radar alone on the nuScenes dataset. Our source code is available at https://github.com/longyunf/rc-pda.
Submitted 4 June, 2021;
originally announced June 2021.
-
Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries
Authors:
Saif Imran,
Xiaoming Liu,
Daniel Morris
Abstract:
Depth completion starts from a sparse set of known depth values and estimates the unknown depths for the remaining image pixels. Most methods model this as depth interpolation and erroneously interpolate depth pixels into the empty space between spatially distinct objects, resulting in depth-smearing across occlusion boundaries. Here we propose a multi-hypothesis depth representation that explicitly models both foreground and background depths in the difficult occlusion-boundary regions. Our method can be thought of as performing twin-surface extrapolation, rather than interpolation, in these regions. Next our method fuses these extrapolated surfaces into a single depth image leveraging the image data. Key to our method is the use of an asymmetric loss function that operates on a novel twin-surface representation. This enables us to train a network to simultaneously do surface extrapolation and surface fusion. We characterize our loss function and compare with other common losses. Finally, we validate our method on three different datasets: KITTI, an outdoor real-world dataset; NYU2, an indoor real-world depth dataset; and Virtual KITTI, a photo-realistic synthetic dataset with dense groundtruth, and demonstrate improvement over the state of the art.
Submitted 25 July, 2021; v1 submitted 5 April, 2021;
originally announced April 2021.
-
CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection
Authors:
Su Pang,
Daniel Morris,
Hayder Radha
Abstract:
There have been significant advances in neural networks for both 3D object detection using LiDAR and 2D object detection using video. However, it has been surprisingly difficult to train networks to effectively use both modalities in a way that demonstrates gain over single-modality networks. In this paper, we propose a novel Camera-LiDAR Object Candidates (CLOCs) fusion network. CLOCs fusion provides a low-complexity multi-modal fusion framework that significantly improves the performance of single-modality detectors. CLOCs operates on the combined output candidates before Non-Maximum Suppression (NMS) of any 2D and any 3D detector, and is trained to leverage their geometric and semantic consistencies to produce more accurate final 3D and 2D detection results. Our experimental evaluation on the challenging KITTI object detection benchmark, including 3D and bird's eye view metrics, shows significant improvements, especially at long distance, over the state-of-the-art fusion based methods. At time of submission, CLOCs ranks the highest among all the fusion-based methods in the official KITTI leaderboard. We will release our code upon acceptance.
Submitted 1 September, 2020;
originally announced September 2020.
-
Sequence Information Channel Concatenation for Improving Camera Trap Image Burst Classification
Authors:
Bhuvan Malladihalli Shashidhara,
Darshan Mehta,
Yash Kale,
Dan Morris,
Megan Hazen
Abstract:
Camera Traps are extensively used to observe wildlife in their natural habitat without disturbing the ecosystem. This could help in the early detection of natural or human threats to animals, and help towards ecological conservation. Currently, a massive number of such camera traps have been deployed at various ecological conservation areas around the world, collecting data for decades, thereby requiring automation to detect images containing animals. Existing systems perform classification to detect if images contain animals by considering a single image. However, due to challenging scenes with animals camouflaged in their natural habitat, it sometimes becomes difficult to identify the presence of animals from merely a single image. We hypothesize that a short burst of images instead of a single image, assuming that the animal moves, makes it much easier for a human as well as a machine to detect the presence of animals. In this work, we explore a variety of approaches, and measure the impact of using short image sequences (burst of 3 images) on improving the camera trap image classification. We show that concatenating masks containing sequence information and the images from the 3-image-burst across channels, improves the ROC AUC by 20% on a test-set from unseen camera-sites, as compared to an equivalent model that learns from a single image.
Submitted 5 June, 2020; v1 submitted 30 April, 2020;
originally announced May 2020.
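The abstract above describes concatenating the images of a 3-image burst (together with masks carrying sequence information) across channels before classification. The following is a minimal sketch of the channel-concatenation step alone, under stated assumptions: images are plain nested lists in height x width x channels layout, and the names `concat_burst_channels` and `make_frame` are hypothetical helpers for this example. How the masks are computed is not given in the abstract, so the sketch simply treats any mask as one more frame to append.

```python
from typing import List

Image = List[List[List[float]]]  # height x width x channels


def concat_burst_channels(burst: List[Image]) -> Image:
    """Stack the frames of a burst along the channel axis, pixel by pixel."""
    height, width = len(burst[0]), len(burst[0][0])
    for frame in burst:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames in a burst must share height and width")
    return [
        [
            # Channels of frame 0, then frame 1, then frame 2, ...
            [value for frame in burst for value in frame[y][x]]
            for x in range(width)
        ]
        for y in range(height)
    ]


def make_frame(height: int, width: int, fill: float, channels: int = 3) -> Image:
    """Build a constant-valued frame (toy data for the example)."""
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]


# Three RGB frames of a 2x2 burst become one 2x2 image with 9 channels.
burst = [make_frame(2, 2, 0.1), make_frame(2, 2, 0.2), make_frame(2, 2, 0.3)]
stacked = concat_burst_channels(burst)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))
```

Because frames are only required to agree on height and width, a single-channel mask could be passed as an additional frame, giving 10 channels for a 3-image RGB burst plus one mask.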
-
The GeoLifeCLEF 2020 Dataset
Authors:
Elijah Cole,
Benjamin Deneu,
Titouan Lorieul,
Maximilien Servajean,
Christophe Botella,
Dan Morris,
Nebojsa Jojic,
Pierre Bonnet,
Alexis Joly
Abstract:
Understanding the geographic distribution of species is a key concern in conservation. By pairing species occurrences with environmental features, researchers can model the relationship between an environment and the species which may be found there. To facilitate research in this area, we present the GeoLifeCLEF 2020 dataset, which consists of 1.9 million species observations paired with high-resolution remote sensing imagery, land cover data, and altitude, in addition to traditional low-resolution climate and soil variables. We also discuss the GeoLifeCLEF 2020 competition, which aims to use this dataset to advance the state-of-the-art in location-based species recommendation.
Submitted 8 April, 2020;
originally announced April 2020.
-
SlideImages: A Dataset for Educational Image Classification
Authors:
David Morris,
Eric Müller-Budack,
Ralph Ewerth
Abstract:
In the past few years, convolutional neural networks (CNNs) have achieved impressive results in computer vision tasks, which however mainly focus on photos with natural scene content. Besides, non-sensor derived images such as illustrations, data visualizations, figures, etc. are typically used to convey complex information or to explore large datasets. However, this kind of images has received little attention in computer vision. CNNs and similar techniques use large volumes of training data. Currently, many document analysis systems are trained in part on scene images due to the lack of large datasets of educational image data. In this paper, we address this issue and present SlideImages, a dataset for the task of classifying educational illustrations. SlideImages contains training data collected from various sources, e.g., Wikimedia Commons and the AI2D dataset, and test data collected from educational slides. We have reserved all the actual educational images as a test dataset in order to ensure that the approaches using this dataset generalize well to new educational images, and potentially other domains. Furthermore, we present a baseline system using a standard deep neural architecture and discuss dealing with the challenge of limited training data.
Submitted 19 January, 2020;
originally announced January 2020.
-
Local Context Normalization: Revisiting Local Normalization
Authors:
Anthony Ortiz,
Caleb Robinson,
Dan Morris,
Olac Fuentes,
Christopher Kiekintveld,
Md Mahmudulla Hassan,
Nebojsa Jojic
Abstract:
Normalization layers have been shown to improve convergence in deep neural networks, and even add useful inductive biases. In many vision applications the local spatial context of the features is important, but most common normalization schemes including Group Normalization (GN), Instance Normalization (IN), and Layer Normalization (LN) normalize over the entire spatial dimension of a feature. This can wash out important signals and degrade performance. For example, in applications that use satellite imagery, input images can be arbitrarily large; consequently, it is nonsensical to normalize over the entire area. Positional Normalization (PN), on the other hand, only normalizes over a single spatial position at a time. A natural compromise is to normalize features by local context, while also taking into account group level information. In this paper, we propose Local Context Normalization (LCN): a normalization layer where every feature is normalized based on a window around it and the filters in its group. We propose an algorithmic solution to make LCN efficient for arbitrary window sizes, even if every point in the image has a unique window. LCN outperforms its Batch Normalization (BN), GN, IN, and LN counterparts for object detection, semantic segmentation, and instance segmentation applications in several benchmark datasets, while keeping performance independent of the batch size and facilitating transfer learning.
Submitted 9 May, 2020; v1 submitted 12 December, 2019;
originally announced December 2019.
-
A deep active learning system for species identification and counting in camera trap images
Authors:
Mohammad Sadegh Norouzzadeh,
Dan Morris,
Sara Beery,
Neel Joshi,
Nebojsa Jojic,
Jeff Clune
Abstract:
Biodiversity conservation depends on accurate, up-to-date information about wildlife population distributions. Motion-activated cameras, also known as camera traps, are a critical tool for population surveys, as they are cheap and non-intrusive. However, extracting useful information from camera trap images is a cumbersome process: a typical camera trap survey may produce millions of images that require slow, expensive manual review. Consequently, critical information is often lost due to resource limitations, and critical conservation questions may be answered too slowly to support decision-making. Computer vision is poised to dramatically increase efficiency in image-based biodiversity surveys, and recent studies have harnessed deep learning techniques for automatic information extraction from camera trap images. However, the accuracy of results depends on the amount, quality, and diversity of the data available to train models, and the literature has focused on projects with millions of relevant, labeled training images. Many camera trap projects do not have a large set of labeled images and hence cannot benefit from existing machine learning techniques. Furthermore, even projects that do have labeled data from similar ecosystems have struggled to adopt deep learning methods because image classification models overfit to specific image backgrounds (i.e., camera locations). In this paper, we focus not on automating the labeling of camera trap images, but on accelerating this process. We combine the power of machine intelligence and human intelligence to build a scalable, fast, and accurate active learning system to minimize the manual work required to identify and count animals in camera trap images. Our proposed scheme can match the state of the art accuracy on a 3.2 million image dataset with as few as 14,100 manual labels, which means decreasing manual labeling effort by over 99.5%.
Submitted 21 October, 2019;
originally announced October 2019.
-
The iWildCam 2019 Challenge Dataset
Authors:
Sara Beery,
Dan Morris,
Pietro Perona
Abstract:
Camera Traps (or Wild Cams) enable the automatic collection of large quantities of image data. Biologists all over the world use camera traps to monitor biodiversity and population density of animal species. The computer vision community has been making strides towards automating the species classification challenge in camera traps, but as we try to expand the scope of these models from specific regions where we have collected training data to different areas we are faced with an interesting problem: how do you classify a species in a new region that you may not have seen in previous training data?
In order to tackle this problem, we have prepared a dataset and challenge where the training data and test data are from different regions, namely the American Southwest and the American Northwest. We use the Caltech Camera Traps dataset, collected from the American Southwest, as training data. We add a new dataset from the American Northwest, curated from data provided by the Idaho Department of Fish and Game (IDFG), as our test dataset. The test data has some class overlap with the training data: some species are found in both datasets, but some species seen during training are not seen during testing, and vice versa. To help fill the gaps in the training species, we allow competitors to utilize transfer learning from two alternate domains: human-curated images from iNaturalist and synthetic images from Microsoft's TrapCam-AirSim simulation environment.
Submitted 15 July, 2019;
originally announced July 2019.
-
Efficient Pipeline for Camera Trap Image Review
Authors:
Sara Beery,
Dan Morris,
Siyu Yang
Abstract:
Biologists all over the world use camera traps to monitor biodiversity and wildlife population density. The computer vision community has been making strides towards automating the species classification challenge in camera traps, but it has proven difficult to apply models trained in one region to images collected in different geographic areas. In some cases, accuracy falls off catastrophically in a new region, due to both changes in background and the presence of previously-unseen species. We propose a pipeline that takes advantage of a pre-trained general animal detector and a smaller set of labeled images to train a classification model that can efficiently achieve accurate results in a new region.
Submitted 15 July, 2019;
originally announced July 2019.
-
Human-Machine Collaboration for Fast Land Cover Mapping
Authors:
Caleb Robinson,
Anthony Ortiz,
Kolya Malkin,
Blake Elias,
Andi Peng,
Dan Morris,
Bistra Dilkina,
Nebojsa Jojic
Abstract:
We propose incorporating human labelers in a model fine-tuning system that provides immediate user feedback. In our framework, human labelers can interactively query model predictions on unlabeled data, choose which data to label, and see the resulting effect on the model's predictions. This bi-directional feedback loop allows humans to learn how the model responds to new data. Our hypothesis is that this rich feedback allows human labelers to create mental models that enable them to better choose which biases to introduce to the model. We compare human-selected points to points selected using standard active learning methods. We further investigate how the fine-tuning methodology impacts the human labelers' performance. We implement this framework for fine-tuning high-resolution land cover segmentation models. Specifically, we fine-tune a deep neural network -- trained to segment high-resolution aerial imagery into different land cover classes in Maryland, USA -- to a new spatial area in New York, USA. The tight loop turns the algorithm and the human operator into a hybrid system that can produce land cover maps of a large area much more efficiently than the traditional workflows. Our framework has applications in geospatial machine learning settings where there is a practically limitless supply of unlabeled data, of which only a small fraction can feasibly be labeled through human efforts.
Submitted 14 November, 2019; v1 submitted 10 June, 2019;
originally announced June 2019.
-
Bean Split Ratio for Dry Bean Canning Quality and Variety Analysis
Authors:
Yunfei Long,
Amber Bassett,
Karen Cichy,
Addie Thompson,
Daniel Morris
Abstract:
Splits on canned beans appear in the process of preparation and canning. Researchers are studying how they are influenced by cooking environment and genotype. However, there is no existing method to automatically quantify or to characterize the severity of splits. To solve this, we propose two measures: the Bean Split Ratio (BSR) that quantifies the overall severity of splits, and the Bean Split Histogram (BSH) that characterizes the size distribution of splits. We create a pixel-wise segmentation method to automatically estimate these measures from images. We also present a bean dataset of recombinant inbred lines of two genotypes, use the BSR and BSH to assess canning quality, and explore heritability of these properties.
Submitted 1 May, 2019;
originally announced May 2019.
-
Synthetic Examples Improve Generalization for Rare Classes
Authors:
Sara Beery,
Yang Liu,
Dan Morris,
Jim Piavis,
Ashish Kapoor,
Markus Meister,
Neel Joshi,
Pietro Perona
Abstract:
The ability to detect and classify rare occurrences in images has important applications - for example, counting rare and endangered species when studying biodiversity, or detecting infrequent traffic scenarios that pose a danger to self-driving cars. Few-shot learning is an open problem: current computer vision systems struggle to categorize objects they have seen only rarely during training, and collecting a sufficient number of training examples of rare events is often challenging and expensive, and sometimes outright impossible. We explore in depth an approach to this problem: complementing the few available training images with ad-hoc simulated data.
Our testbed is animal species classification, which has a real-world long-tailed distribution. We analyze the effect of different axes of variation in simulation, such as pose, lighting, model, and simulation method, and we prescribe best practices for efficiently incorporating simulated data for real-world performance gain. Our experiments reveal that synthetic data can considerably reduce error rates for classes that are rare, that as the amount of simulated data is increased, accuracy on the target class improves, and that high variation of simulated data provides maximum performance gain.
Submitted 14 May, 2019; v1 submitted 11 April, 2019;
originally announced April 2019.
-
Depth Coefficients for Depth Completion
Authors:
Saif Imran,
Yunfei Long,
Xiaoming Liu,
Daniel Morris
Abstract:
Depth completion involves estimating a dense depth image from sparse depth measurements, often guided by a color image. While linear upsampling is straightforward, it results in artifacts including depth pixels being interpolated in empty space across discontinuities between objects. Current methods use deep networks to upsample and "complete" the missing depth pixels. Nevertheless, depth smearing between objects remains a challenge. We propose a new representation for depth called Depth Coefficients (DC) to address this problem. It enables convolutions to more easily avoid inter-object depth mixing. We also show that the standard Mean Squared Error (MSE) loss function can promote depth mixing, and thus propose instead to use cross-entropy loss for DC. With quantitative and qualitative evaluation on benchmarks, we show that switching out sparse depth input and MSE loss with our DC representation and cross-entropy loss is a simple way to improve depth completion performance, and reduce pixel depth mixing, which leads to improved depth-based object detection.
Submitted 13 March, 2019;
originally announced March 2019.
-
A Pyramid CNN for Dense-Leaves Segmentation
Authors:
Daniel D. Morris
Abstract:
Automatic detection and segmentation of overlapping leaves in dense foliage can be a difficult task, particularly for leaves with strong textures and high occlusions. We present Dense-Leaves, an image dataset with ground truth segmentation labels that can be used to train and quantify algorithms for leaf segmentation in the wild. We also propose a pyramid convolutional neural network with multi-scale predictions that detects and discriminates leaf boundaries from interior textures. Using these detected boundaries, closed-contour boundaries around individual leaves are estimated with a watershed-based algorithm. The result is an instance segmenter for dense leaves. Promising segmentation results for leaves in dense foliage are obtained.
Submitted 4 April, 2018;
originally announced April 2018.
-
A View-Dependent Adaptive Matched Filter for LADAR-Based Vehicle Tracking
Authors:
Daniel D. Morris,
Regis Hoffman,
Paul Haley
Abstract:
LADARs mounted on mobile platforms produce a wealth of precise range data on the surrounding objects and vehicles. The challenge we address is to infer from these raw LADAR data the location and orientation of nearby vehicles. We propose a novel view-dependent adaptive matched filter for obtaining fast and precise measurements of target vehicle pose. We derive an analytic expression for the matching function which we optimize to obtain target pose and size. Our algorithm is fast, robust and simple to implement compared to other methods. When used as the measurement component of a tracker on an autonomous ground vehicle, we are able to track in excess of 50 targets at 10 Hz. Once targets are aligned using our matched filter, we use a support vector-based discriminator to distinguish vehicles from other objects. This tracker provides a key sensing component for our autonomous ground vehicles which have accumulated hundreds of miles of on-road and off-road autonomous driving.
Submitted 25 September, 2017;
originally announced September 2017.
-
LADAR-Based Vehicle Tracking and Trajectory Estimation for Urban Driving
Authors:
Daniel Morris,
Paul Haley,
William Zachar,
Steve McLean
Abstract:
Safe mobility for unmanned ground vehicles requires reliable detection of other vehicles, along with precise estimates of their locations and trajectories. Here we describe the algorithms and system we have developed for accurate trajectory estimation of nearby vehicles using an onboard scanning LADAR. We introduce a variable-axis Ackermann steering model and compare this to an independent steering model. Then for robust tracking we propose a multi-hypothesis tracker that combines these kinematic models to leverage the strengths of each. When trajectories estimated with our techniques are input into a planner, they enable an unmanned vehicle to negotiate traffic in urban environments. Results have been evaluated running in real time on a moving vehicle with a scanning LADAR.
Submitted 25 September, 2017;
originally announced September 2017.
-
LADAR-Based Mover Detection from Moving Vehicles
Authors:
Daniel D. Morris,
Brian Colonna,
Paul Haley
Abstract:
Detecting moving vehicles and people is crucial for safe operation of UGVs but is challenging in cluttered, real-world environments. We propose a registration technique that enables objects to be robustly matched and tracked, and hence movers to be detected even in high clutter. Range data are acquired using a 2D scanning LADAR from a moving platform. These are automatically clustered into objects and modeled using a surface density function. A Bhattacharyya similarity is optimized to register subsequent views of each object, enabling good discrimination and tracking, and hence mover detection.
Submitted 25 September, 2017;
originally announced September 2017.
-
Use of Docker for deployment and testing of astronomy software
Authors:
D. Morris,
S. Voutsinas,
N. C. Hambly,
R. G. Mann
Abstract:
We describe preliminary investigations of using Docker for the deployment and testing of astronomy software. Docker is a relatively new containerisation technology that is developing rapidly and being adopted across a range of domains. It is based upon virtualization at operating system level, which presents many advantages in comparison to the more traditional hardware virtualization that underpins most cloud computing infrastructure today. A particular strength of Docker is its simple format for describing and managing software containers, which has benefits for software developers, system administrators and end users.
We report on our experiences from two projects -- a simple activity to demonstrate how Docker works, and a more elaborate set of services that demonstrates more of its capabilities and what they can achieve within an astronomical context -- and include an account of how we solved problems through interaction with Docker's very active open source development community, which is currently the key to the most effective use of this rapidly-changing technology.
Submitted 11 July, 2017;
originally announced July 2017.
-
Time Series Cube Data Model
Authors:
Jiri Nadvornik,
Petr Skoda,
Dave Morris,
Pavel Tvrdik
Abstract:
The purpose of this document is to create a data model and its serialization for expressing generic time series data. Already existing IVOA data models are reused as much as possible. The model is also made as generic as possible to be open to new extensions but at the same time closed for modifications. This enables maintaining interoperability throughout different versions of the data model. We define the necessary building blocks for metadata discovery, serialization of time series data and understanding it by clients. We present several categories of time series science cases with examples of implementation. We also take into account the most pressing topics for time series providers like tracking original images for every individual point of a light curve or time-derived axes like frequency for gravitational wave analysis. The main motivation for the creation of a new model is to provide a unified time series data publishing standard - not only for light curves but also more generic time series data, e.g., radial velocity curves, power spectra, hardness ratio, provenance linkage, etc. Flexibility is the most crucial part of our model - we are not dependent on any physical domain or frame models. While images or spectra are already stable and standardized products, the time series related domains are still not completely evolved and new ones will likely emerge in the near future. That is why we need to keep models like Time Series Cube DM independent of any underlying physical models. In our opinion, this is the only correct and sustainable way for future development of IVOA standards.
Submitted 11 January, 2018; v1 submitted 5 February, 2017;
originally announced February 2017.
-
Extremal sequences of polynomial complexity
Authors:
Kevin G. Hare,
Ian D. Morris,
Nikita Sidorov
Abstract:
The joint spectral radius of a bounded set of $d \times d$ real matrices is defined to be the maximum possible exponential growth rate of products of matrices drawn from that set. For a fixed set of matrices, a sequence of matrices drawn from that set is called \emph{extremal} if the associated sequence of partial products achieves this maximal rate of growth. An influential conjecture of J. Lagarias and Y. Wang asked whether every finite set of matrices admits an extremal sequence which is periodic. This is equivalent to the assertion that every finite set of matrices admits an extremal sequence with bounded subword complexity. Counterexamples were subsequently constructed which have the property that every extremal sequence has at least linear subword complexity. In this paper we extend this result to show that for each integer $p \geq 1$, there exists a pair of square matrices of dimension $2^p(2^{p+1}-1)$ for which every extremal sequence has subword complexity at least $2^{-p^2}n^p$.
Submitted 24 April, 2013; v1 submitted 30 January, 2012;
originally announced January 2012.
-
AstroDAbis: Annotations and Cross-Matches for Remote Catalogues
Authors:
Norman Gray,
Robert G Mann,
Dave Morris,
Mark Holliman,
Keith Noddle
Abstract:
Astronomers are good at sharing data, but poorer at sharing knowledge.
Almost all astronomical data ends up in open archives, and access to these is being simplified by the development of the global Virtual Observatory (VO). This is a great advance, but the fundamental problem remains that these archives contain only basic observational data, whereas all the astrophysical interpretation of that data -- which source is a quasar, which a low-mass star, and which an image artefact -- is contained in journal papers, with very little linkage back from the literature to the original data archives. It is therefore currently impossible for an astronomer to pose a query like "give me all sources in this data archive that have been identified as quasars" and this limits the effective exploitation of these archives, as the user of an archive has no direct means of taking advantage of the knowledge derived by its previous users.
The AstroDAbis service aims to address this, in a prototype service enabling astronomers to record annotations and cross-identifications in the AstroDAbis service, annotating objects in other catalogues. We have deployed two interfaces to the annotations, namely one astronomy-specific one using the TAP protocol, and a second exploiting generic Linked Open Data (LOD) and RDF techniques.
Submitted 25 November, 2011;
originally announced November 2011.
-
On a Devil's staircase associated to the joint spectral radii of a family of pairs of matrices
Authors:
Ian D. Morris,
Nikita Sidorov
Abstract:
The joint spectral radius of a finite set of real d x d matrices is defined to be the maximum possible exponential rate of growth of products of matrices drawn from that set. In previous work with K. G. Hare and J. Theys we showed that for a certain one-parameter family of pairs of matrices, this maximum possible rate of growth is attained along Sturmian sequences with a certain characteristic ratio which depends continuously upon the parameter. In this paper we answer some open questions from that paper by showing that the dependence of the ratio function upon the parameter takes the form of a Devil's staircase. We show in particular that this Devil's staircase attains every rational value strictly between 0 and 1 on some interval, and attains irrational values only in a set of Hausdorff dimension zero. This result generalises to include certain one-parameter families considered by other authors. We also give explicit formulas for the preimages of both rational and irrational numbers under the ratio function, thereby establishing a large family of pairs of matrices for which the joint spectral radius may be calculated exactly.
Submitted 10 June, 2013; v1 submitted 18 July, 2011;
originally announced July 2011.
-
An explicit counterexample to the Lagarias-Wang finiteness conjecture
Authors:
Kevin G. Hare,
Ian D. Morris,
Nikita Sidorov,
Jacques Theys
Abstract:
The joint spectral radius of a finite set of real $d \times d$ matrices is defined to be the maximum possible exponential rate of growth of long products of matrices drawn from that set. A set of matrices is said to have the \emph{finiteness property} if there exists a periodic product which achieves this maximal rate of growth. J.C. Lagarias and Y. Wang conjectured in 1995 that every finite set of real $d \times d$ matrices satisfies the finiteness property. However, T. Bousch and J. Mairesse proved in 2002 that counterexamples to the finiteness conjecture exist, showing in particular that there exists a family of pairs of $2 \times 2$ matrices which contains a counterexample. Similar results were subsequently given by V.D. Blondel, J. Theys and A.A. Vladimirov and by V.S. Kozyakin, but no explicit counterexample to the finiteness conjecture has so far been given. The purpose of this paper is to resolve this issue by giving the first completely explicit description of a counterexample to the Lagarias-Wang finiteness conjecture. Namely, for the set \[ \mathsf{A}_{\alpha_*} := \left\{ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},\ \alpha_* \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \right\} \] we give an explicit value of
\[ \alpha_* \simeq 0.749326546330367557943961948091344672091327370236064317358024\ldots \] such that $\mathsf{A}_{\alpha_*}$ does not satisfy the finiteness property.
Submitted 11 November, 2010; v1 submitted 10 June, 2010;
originally announced June 2010.