Abstract

Crop biomass, a critical indicator of plant growth, health, and productivity, is invaluable for crop breeding programs and agronomic research. However, accurate and scalable quantification of crop biomass remains out of reach due to limitations in existing measurement methods. One of the obstacles impeding the advancement of crop biomass prediction methods is the scarcity of publicly available datasets. To address this gap, we introduce a new dataset in this domain: the Multi-modality dataset for crop biomass estimation (MMCBE). Comprising 216 sets of multi-view drone images coupled with LiDAR point clouds and hand-labelled ground truth, MMCBE is the first multi-modality dataset in the field. This dataset aims to establish benchmark methods for crop biomass quantification and foster the development of vision-based approaches. We evaluate state-of-the-art crop biomass estimation methods on MMCBE and explore additional potential applications, such as 3D crop reconstruction from drone imagery and novel-view rendering. With this publication, we make the dataset available to the broader community.

1 Introduction

The development of advanced agricultural technologies is required to address the challenges of food security, which are intensified by a rapidly growing global population. Among the various traits monitored in sustainable agricultural systems, biomass (defined as the amount of organic matter produced by plants) is pivotal: monitoring biomass aids in assessing the success of plant establishment and informs critical decisions regarding replanting or the application of treatments such as chemical sprays. The estimation and mapping of biomass on a large scale, up to global maps, have been extensively explored within the remote sensing community [DUNCANSON2022112845, bullock2023estimating]. These large-scale estimates, which primarily target vegetation biomass including forests, are typically produced at resolutions ranging from 30 metres to 1 kilometre. In contrast, this paper focuses on the high-resolution estimation of crop biomass, and the term ‘biomass’ refers specifically to above-ground biomass (i.e. excluding roots) of agricultural crops, which is crucial for the precision required in contemporary agricultural monitoring systems. Accurate and automated estimation of crop biomass is a complex task that requires the integration of computer vision, robotics, and machine learning technologies. Despite the importance of this parameter, accurate and scalable measurement of crop biomass remains a challenge due to the limitations of existing methodologies [pan2022biomass, loudermilk2009ground].

Traditional methods for quantifying crop biomass typically involve harvesting and measuring the biomass within a small, designated plot area (e.g., a square metre or a linear metre). These values are then extrapolated to estimate the biomass of an entire plot or field. Such approaches are inherently destructive and labour-intensive, which limits their scalability and prevents the collection of temporal data on biomass changes. In contrast, advances in computer vision offer a non-destructive alternative that enables biomass estimation at different time points [walter2019estimating, jimenez2018high, liu2018robust, caballer2021prediction, sun2018field, polo2009tractor, li2020real, ten2019biomass, pan2022biomass]. Specifically, Light Detection and Ranging (LiDAR) technology has been employed to capture critical plant metrics, such as height and point density, which serve as proxies for biomass. While these proxy-based methods offer a degree of generalizability, their reliance on simplistic 3D feature representations and the challenge of canopy self-occlusion have historically limited their accuracy. To address these shortcomings, deep learning techniques have been applied to point clouds, significantly enhancing the precision of biomass estimates by overcoming the issue of self-occlusion and extracting more robust 3D features [pan2022biomass, ma2023automated]. This suggests that an increasing number of deep-learning-based approaches are likely to be developed for crop biomass prediction tasks.
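To make the proxy-based idea concrete, the following is a minimal sketch, not the method of any cited work. It assumes each plot's point cloud is an (N, 3) array of x, y, z coordinates in metres and that the plot's ground area is known. The function names, the percentile choices, and the linear model are illustrative assumptions only.

```python
import numpy as np


def proxy_features(points, plot_area_m2, ground_percentile=1.0, top_percentile=95.0):
    """Return (canopy_height_m, point_density_per_m2) for one plot.

    points: (N, 3) array of x, y, z coordinates in metres (assumed layout).
    plot_area_m2: ground area of the plot in square metres (assumed known).
    """
    points = np.asarray(points, dtype=float)
    if points.ndim != 2 or points.shape[1] != 3:
        raise ValueError("points must have shape (N, 3)")
    z = points[:, 2]
    # Percentiles instead of min/max, to be less sensitive to stray returns.
    ground = np.percentile(z, ground_percentile)
    top = np.percentile(z, top_percentile)
    canopy_height = top - ground
    density = points.shape[0] / plot_area_m2
    return canopy_height, density


def fit_linear_proxy_model(features, biomass):
    """Least-squares fit of biomass ~ features @ w + b; returns [w..., b]."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(biomass, dtype=float)
    A = np.column_stack([X, np.ones(len(X))])  # append intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_biomass(coef, features):
    """Apply fitted coefficients to one or more feature rows."""
    X = np.atleast_2d(np.asarray(features, dtype=float))
    return X @ coef[:-1] + coef[-1]


# Usage with synthetic values (not data from any real dataset).
train_features = [[0.5, 100.0], [1.0, 150.0], [0.8, 300.0], [1.2, 250.0]]
train_biomass = [2.0 * h + 0.01 * d + 0.5 for h, d in train_features]
coef = fit_linear_proxy_model(train_features, train_biomass)
estimate = predict_biomass(coef, [1.0, 200.0])
```

Because such a model sees only two hand-crafted summary statistics per plot, it cannot represent canopy structure hidden by self-occlusion, which is the limitation that motivates learning features directly from the point cloud.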

The significance of comprehensive datasets, such as ImageNet [deng2009imagenet] and COCO [lin2014microsoft], to the advancement of convolutional neural networks (CNNs) in image recognition tasks has been well documented [ren2015faster, li2019detection, li2023efficient, resnet_simple]. An important driving force behind the development of deep learning methodologies is the availability of publicly accessible datasets, which not only offer expansive training data but also establish transparent benchmarks for evaluating performance. To further advance deep-learning-based approaches for crop biomass prediction, it is critical to increase the availability of public datasets within the agricultural technology community. However, despite the urgent need for such datasets, they remain scarce. To the best of our knowledge, there is only one publicly available crop dataset dedicated to LiDAR-based biomass estimation [pan2022biomass]. To address this gap, we contribute a new dataset for crop biomass prediction tasks, MMCBE, which includes both point clouds and multi-view images with manually measured biomass ground truth, making it the first multi-modality dataset in this area. Our point cloud data can expand the training data available to biomass prediction approaches and can be used to evaluate current methods. The multi-view image data, the first in this domain, aims to inspire further vision-based biomass prediction methods. These methods are particularly promising because RGB cameras are widely available and affordable, which gives them potential for large-scale application. Our main contributions in this paper are as follows:

  • We publish the first multi-modality dataset for crop biomass estimation (MMCBE), which includes 216 sets of multi-view drone images with LiDAR point clouds and hand-labelled ground truth. The dataset will be released at https://github.com/Benzlxs/MMCBE

  • We provide a benchmark for evaluating crop biomass estimation approaches.

  • We explore additional potential tasks, such as 3D crop reconstruction from multi-view images and novel-view synthesis.

The rest of the paper is organized as follows. We first review the main current biomass prediction methods and other relevant datasets in this area, and then introduce our multi-modality dataset and data processing pipeline. We next evaluate state-of-the-art methods and explore other computer vision tasks on our dataset, and finally conclude the paper.