Plant Doctor: A hybrid machine learning and image segmentation software to quantify plant damage in video footage

Authors: Marc Josep Montagut Marques, Liu Mingxin, Kuri Thomas Shiojiri, Tomika Hagiwara, Kayo Hirose, Kaori Shiojiri, Shinjiro Umezu

Abstract: Artificial intelligence has significantly advanced the automation of diagnostic processes, benefiting various fields including agriculture. This study introduces an AI-based system for the automatic diagnosis of urban street plants using video footage obtained with accessible camera devices. The system aims to monitor plant health on a day-to-day basis, aiding in the control of disease spreading in urban areas. By combining two machine vision algorithms, YOLOv8 and DeepSORT, the system efficiently identifies and tracks individual leaves, extracting the optimal images for health analysis. YOLOv8, chosen for its speed and computational efficiency, locates leaves, while DeepSORT ensures robust tracking in complex environments. For detailed health assessment, DeepLabV3Plus, a convolutional neural network, is employed to segment and quantify leaf damage caused by bacteria, pests, and fungi. The hybrid system, named Plant Doctor, has been trained and validated using a diverse dataset including footage from Tokyo urban plants. The results demonstrate the robustness and accuracy of the system in diagnosing leaf damage, with potential applications in large scale urban flora illness monitoring. This approach provides a non-invasive, efficient, and scalable solution for urban tree health management, supporting sustainable urban ecosystems.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 29 pages, 10 figures, 2 tables

arXiv:2407.00031 [pdf, other]

Supercharging Federated Learning with Flower and NVIDIA FLARE

Authors: Holger R. Roth, Daniel J. Beutel, Yan Cheng, Javier Fernandez Marques, Heng Pan, Chester Chen, Zhihong Zhang, Yuhong Wen, Sean Yang, Isaac Yang, Yuan-Ting Hsieh, Ziyue Xu, Daguang Xu, Nicholas D. Lane, Andrew Feng

Abstract: Several open-source systems, such as Flower and NVIDIA FLARE, have been developed in recent years while focusing on different aspects of federated learning (FL). Flower is dedicated to implementing a cohesive approach to FL, analytics, and evaluation. Over time, Flower has cultivated extensive strategies and algorithms tailored for FL application development, fostering a vibrant FL community in research and industry. Conversely, FLARE has prioritized the creation of an enterprise-ready, resilient runtime environment explicitly designed for FL applications in production environments. In this paper, we describe our initial integration of both frameworks and show how they can work together to supercharge the FL ecosystem as a whole. Through the seamless integration of Flower and FLARE, applications crafted within the Flower framework can effortlessly operate within the FLARE runtime environment without necessitating any modifications. This initial integration streamlines the process, eliminating complexities and ensuring smooth interoperability between the two platforms, thus enhancing the overall efficiency and accessibility of FL applications.

Submitted 21 May, 2024; originally announced July 2024.

arXiv:2406.17526 [pdf, other]

LumberChunker: Long-Form Narrative Document Segmentation

Authors: André V. Duarte, João Marques, Miguel Graça, Miguel Freire, Lei Li, Arlindo L. Oliveira

Abstract: Modern NLP tasks increasingly rely on dense retrieval methods to access up-to-date and relevant contextual information. We are motivated by the premise that retrieval benefits from segments that can vary in size such that a content's semantic independence is better captured. We propose LumberChunker, a method leveraging an LLM to dynamically segment documents, which iteratively prompts the LLM to identify the point within a group of sequential passages where the content begins to shift. To evaluate our method, we introduce GutenQA, a benchmark with 3000 "needle in a haystack" type of question-answer pairs derived from 100 public domain narrative books available on Project Gutenberg. Our experiments show that LumberChunker not only outperforms the most competitive baseline by 7.37% in retrieval performance (DCG@20) but also that, when integrated into a RAG pipeline, LumberChunker proves to be more effective than other chunking methods and competitive baselines, such as the Gemini 1.5M Pro. Our Code and Data are available at https://github.com/joaodsmarques/LumberChunker

Submitted 25 June, 2024; originally announced June 2024.

ACM Class: I.2

arXiv:2404.04189 [pdf, other]

Are We Up to the Challenge? An analysis of the FCC Broadband Data Collection Fixed Internet Availability Challenges

Authors: Jonatas Marques, Alexis Schrubbe, Nicole P. Marwell, Nick Feamster

Abstract: In 2021, the Broadband Equity, Access, and Deployment (BEAD) program allocated $42.45 billion to enhance high-speed internet access across the United States. As part of this funding initiative, the Federal Communications Commission (FCC) developed a national coverage map to guide the allocation of BEAD funds. This map was the key determinant to direct BEAD investments to areas in need of broadband infrastructure improvements. The FCC encouraged public participation in refining this coverage map through the submission of "challenges" to either locations on the map or the status of broadband at any location on the map. These challenges allowed citizens and organizations to report discrepancies between the map's data and actual broadband availability, ensuring a more equitable distribution of funds. In this paper, we present a study analyzing the nature and distribution of these challenges across different access technologies and geographic areas. Among several other insights, we observe, for example, that the majority of challenges (about 58%) were submitted against terrestrial fixed wireless technologies, and that the state of Nebraska had the strongest engagement in the challenge process, with more than 75% of its broadband-serviceable locations having submitted at least one challenge.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 9 pages, 14 tables, working draft

ACM Class: C.2.0; J.4; K.4

arXiv:2402.03694 [pdf, other]

ServeFlow: A Fast-Slow Model Architecture for Network Traffic Analysis

Authors: Shinan Liu, Ted Shaowang, Gerry Wan, Jeewon Chae, Jonatas Marques, Sanjay Krishnan, Nick Feamster

Abstract: Network traffic analysis increasingly uses complex machine learning models as the internet consolidates and traffic gets more encrypted. However, over high-bandwidth networks, flows can easily arrive faster than model inference rates. The temporal nature of network flows limits simple scale-out approaches leveraged in other high-traffic machine learning applications. Accordingly, this paper presents ServeFlow, a solution for machine-learning model serving aimed at network traffic analysis tasks, which carefully selects the number of packets to collect and the models to apply for individual flows to achieve a balance between minimal latency, high service rate, and high accuracy. We identify that on the same task, inference time across models can differ by 2.7x-136.3x, while the median inter-packet waiting time is often 6-8 orders of magnitude higher than the inference time! ServeFlow is able to make inferences on 76.3% of flows in under 16ms, which is a speed-up of 40.5x on the median end-to-end serving latency while increasing the service rate and maintaining similar accuracy. Even with thousands of features per flow, it achieves a service rate of over 48.5k new flows per second on a 16-core CPU commodity server, which matches the order of magnitude of flow rates observed on city-level network backbones.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2311.10018 [pdf, other]

On the Overconfidence Problem in Semantic 3D Mapping

Authors: Joao Marcos Correia Marques, Albert Zhai, Shenlong Wang, Kris Hauser

Abstract: Semantic 3D mapping, the process of fusing depth and image segmentation information between multiple views to build 3D maps annotated with object classes in real-time, is a recent topic of interest. This paper highlights the fusion overconfidence problem, in which conventional mapping methods assign high confidence to the entire map even when they are incorrect, leading to miscalibrated outputs. Several methods to improve uncertainty calibration at different stages in the fusion pipeline are presented and compared on the ScanNet dataset. We show that the most widely used Bayesian fusion strategy is among the worst calibrated, and propose a learned pipeline that combines fusion and calibration, GLFS, which achieves simultaneously higher accuracy and 3D map calibration while retaining real-time capability. We further illustrate the importance of map calibration on a downstream task by showing that incorporating proper semantic fusion on a modular ObjectNav agent improves its success rates. Our code will be provided on GitHub for reproducibility upon acceptance.

Submitted 16 November, 2023; originally announced November 2023.

Comments: This is a preprint for the work submitted to the ICRA 2024 conference

ACM Class: I.2.9; I.2.10

arXiv:2309.10526 [pdf, other]

NSOAMT -- New Search Only Approach to Machine Translation

Authors: João Luís, Diogo Cardoso, José Marques, Luís Campos

Abstract: Translation automation mechanisms and tools have been developed for several years to bring people who speak different languages together. A "new search only approach to machine translation" was adopted to tackle some of the slowness and inaccuracy of the other technologies. The idea is to develop a solution that, by indexing an incremental set of words that combine a certain semantic meaning, makes it possible to create a process of correspondence between their native language record and the language of translation. This research principle assumes that the vocabulary used in a given type of publication/document is relatively limited in terms of language style and word diversity, which enhances the instantaneity and rigor of the translation process through the indexing process. A volume of electronic text documents was processed and loaded into a database, then analyzed and measured in order to confirm the previous premise. Although the observed and projected metric values did not give encouraging results, it was possible to develop and make available a translation tool using this approach.

Submitted 19 September, 2023; originally announced September 2023.

Comments: 17 pages, 13 figures, 12 tables

arXiv:2103.14137 [pdf, other]

Optimized Coverage Planning for UV Surface Disinfection

Authors: Joao Marcos Correia Marques, Ramya Ramalingam, Zherong Pan, Kris Hauser

Abstract: UV radiation has been used as a disinfection strategy to deactivate a wide range of pathogens, but existing irradiation strategies do not ensure sufficient exposure of all environmental surfaces and/or require long disinfection times. We present a near-optimal coverage planner for mobile UV disinfection robots. The formulation optimizes the irradiation time efficiency, while ensuring that a sufficient dosage of radiation is received by each surface. The trajectory and dosage plan are optimized taking collision and light occlusion constraints into account. We propose a two-stage scheme to approximate the solution of the induced NP-hard optimization, and, for efficiency, perform key irradiance and occlusion calculations on a GPU. Empirical results show that our technique achieves more coverage for the same exposure time as strategies for existing UV robots, can be used to compare UV robot designs, and produces near-optimal plans. This is an extended version of the paper originally contributed to ICRA2021.

Submitted 25 March, 2021; originally announced March 2021.

Comments: 13 pages, 18 figures

ACM Class: I.2.8; I.2.9

arXiv:1712.02546 [pdf, other]

Distributed learning of CNNs on heterogeneous CPU/GPU architectures

Authors: Jose Marques, Gabriel Falcao, Luís A. Alexandre

Abstract: Convolutional Neural Networks (CNNs) have been shown to be powerful classification tools in tasks that range from check reading to medical diagnosis, reaching close to human perception, and in some cases surpassing it. However, the problems to solve are becoming larger and more complex, which translates to larger CNNs, leading to longer training times that not even the adoption of Graphics Processing Units (GPUs) could keep up with. This problem is partially solved by using more processing units and distributed training methods that are offered by several frameworks dedicated to neural network training. However, these techniques do not take full advantage of the possible parallelization offered by CNNs and the cooperative use of heterogeneous devices with different processing capabilities, clock speeds, memory size, among others. This paper presents a new method for the parallel training of CNNs that can be considered as a particular instantiation of model parallelism, where only the convolutional layer is distributed. In fact, the convolutions processed during training (forward and backward propagation included) represent $60$-$90$\% of global processing time. The paper analyzes the influence of network size, bandwidth, batch size, number of devices, including their processing capabilities, and other parameters. Results show that this technique is capable of diminishing the training time without affecting the classification performance for both CPUs and GPUs.
For the CIFAR-10 dataset, using a CNN with two convolutional layers, and $500$ and $1500$ kernels, respectively, the best speedups reach $3.28\times$ using four CPUs and $2.45\times$ with three GPUs. Modern imaging datasets, larger and more complex than CIFAR-10, will certainly require more than $60$-$90$\% of processing time calculating convolutions, and speedups will tend to increase accordingly.

Submitted 7 December, 2017; originally announced December 2017.

arXiv:1709.05324 [pdf, other]

Cystoid macular edema segmentation of Optical Coherence Tomography images using fully convolutional neural networks and fully connected CRFs

Authors: Fangliang Bai, Manuel J. Marques, Stuart J. Gibson

Abstract: In this paper we present a new method for cystoid macular edema (CME) segmentation in retinal Optical Coherence Tomography (OCT) images, using a fully convolutional neural network (FCN) and fully connected conditional random fields (dense CRFs). As a first step, the framework trains the FCN model to extract features from retinal layers in OCT images, which exhibit CME, and then segments CME regions using the trained model. Thereafter, dense CRFs are used to refine the segmentation according to the edema appearance. We have trained and tested the framework with OCT images from 10 patients with diabetic macular edema (DME). Our experimental results show that fluid and concrete macular edema areas were segmented with good adherence to boundaries. A segmentation accuracy of $0.61\pm 0.21$ (Dice coefficient) was achieved, with respect to the ground truth, which compares favourably with the previous state-of-the-art that used a kernel regression based method ($0.51\pm 0.34$). Our approach is versatile and we believe it can be easily adapted to detect other macular defects.

Submitted 15 September, 2017; originally announced September 2017.

arXiv:1203.2205 [pdf, other]

doi 10.1109/TMI.2011.2173698

Spread spectrum magnetic resonance imaging

Authors: Gilles Puy, Jose P. Marques, Rolf Gruetter, Jean-Philippe Thiran, Dimitri Van De Ville, Pierre Vandergheynst, Yves Wiaux

Abstract: We propose a novel compressed sensing technique to accelerate the magnetic resonance imaging (MRI) acquisition process. The method, coined spread spectrum MRI or simply s2MRI, consists of pre-modulating the signal of interest by a linear chirp before random k-space under-sampling, and then reconstructing the signal with non-linear algorithms that promote sparsity. The effectiveness of the procedure is theoretically underpinned by the optimization of the coherence between the sparsity and sensing bases. The proposed technique is thoroughly studied by means of numerical simulations, as well as phantom and in vivo experiments on a 7T scanner. Our results suggest that s2MRI performs better than state-of-the-art variable density k-space under-sampling approaches.

Submitted 9 March, 2012; originally announced March 2012.

Journal ref: IEEE Transactions on Medical Imaging, vol. 31(3), pp. 586-598, 2012

arXiv:0805.4680 [pdf, ps, other]

Telex: Principled System Support for Write-Sharing in Collaborative Applications

Authors: Lamia Benmouffok, Jean-Michel Busca, Joan Manuel Marquès, Marc Shapiro, Pierre Sutra, Georgios Tsoukalas

Abstract: The Telex system is designed for sharing mutable data in a distributed environment, particularly for collaborative applications. Users operate on their local, persistent replica of shared documents; they can work disconnected and suffer no network latency. The Telex approach to detect and correct conflicts is application independent, based on an action-constraint graph (ACG) that summarises the concurrency semantics of applications. The ACG is stored efficiently in a multilog structure that eliminates contention and is optimised for locality. Telex supports multiple applications and multi-document updates. The Telex system clearly separates system logic (which includes replication, views, undo, security, consistency, conflicts, and commitment) from application logic. An example application is a shared calendar for managing multi-user meetings; the system detects meeting conflicts and resolves them consistently.

Submitted 10 June, 2008; v1 submitted 30 May, 2008; originally announced May 2008.

Report number: RR-6546

Showing 1–12 of 12 results for author: Marques, J