Search | arXiv e-print repository

On the minimum spectral radius of connected graphs of given order and size

Authors: Sebastian M. Cioabă, Vishal Gupta, Celso Marques

Abstract: In this paper, we study a question of Hong from 1993 related to the minimum spectral radii of the adjacency matrices of connected graphs of given order and size. Hong asked if it is true that among all connected graphs of given number of vertices $n$ and number of edges $e$, the graphs having minimum spectral radius (the minimizer graphs) must be almost regular, meaning that the difference between their maximum degree and their minimum degree is at most one. In this paper, we answer Hong's question positively for various values of $n$ and $e$ and in several cases, we determined the graphs with minimum spectral radius.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 19 pages, 6 figures

MSC Class: 05C50; 15A18

arXiv:2311.10018 [pdf, other]

On the Overconfidence Problem in Semantic 3D Mapping

Authors: Joao Marcos Correia Marques, Albert Zhai, Shenlong Wang, Kris Hauser

Abstract: Semantic 3D mapping, the process of fusing depth and image segmentation information between multiple views to build 3D maps annotated with object classes in real-time, is a recent topic of interest. This paper highlights the fusion overconfidence problem, in which conventional mapping methods assign high confidence to the entire map even when they are incorrect, leading to miscalibrated outputs. Several methods to improve uncertainty calibration at different stages in the fusion pipeline are presented and compared on the ScanNet dataset. We show that the most widely used Bayesian fusion strategy is among the worst calibrated, and propose a learned pipeline that combines fusion and calibration, GLFS, which achieves simultaneously higher accuracy and 3D map calibration while retaining real-time capability. We further illustrate the importance of map calibration on a downstream task by showing that incorporating proper semantic fusion on a modular ObjectNav agent improves its success rates. Our code will be provided on GitHub for reproducibility upon acceptance.

Submitted 16 November, 2023; originally announced November 2023.

Comments: This is a preprint for the work submitted to the ICRA 2024 conference

ACM Class: I.2.9; I.2.10

arXiv:2307.13124 [pdf, ps, other]

Conformal prediction for frequency-severity modeling

Authors: Helton Graziadei, Paulo C. Marques F., Eduardo F. L. de Melo, Rodrigo S. Targino

Abstract: We present a nonparametric model-agnostic framework for building prediction intervals of insurance claims, with finite sample statistical guarantees, extending the technique of split conformal prediction to the domain of two-stage frequency-severity modeling. The effectiveness of the framework is showcased with simulated and real datasets. When the underlying severity model is a random forest, we extend the two-stage split conformal prediction procedure, showing how the out-of-bag mechanism can be leveraged to eliminate the need for a calibration set and to enable the production of prediction intervals with adaptive width.

Submitted 27 July, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

arXiv:2303.02770 [pdf, other]

On the universal distribution of the coverage in split conformal prediction

Authors: Paulo C. Marques F.

Abstract: Two additional universal properties are established in the split conformal prediction framework. In a regression setting with exchangeable data, we determine the exact distribution of the coverage of prediction sets for a finite horizon of future observables, as well as the exact distribution of its almost sure limit. The results hold for finite training and calibration samples, and both distributions are determined solely by the nominal miscoverage level and the calibration sample size.

Submitted 5 March, 2023; originally announced March 2023.

Comments: 13 pages, 2 figures

arXiv:2112.06101 [pdf, ps, other]

Confidence intervals for the random forest generalization error

Authors: Paulo C. Marques F.

Abstract: We show that the byproducts of the standard training process of a random forest yield not only the well known and almost computationally free out-of-bag point estimate of the model generalization error, but also give a direct path to compute confidence intervals for the generalization error which avoids processes of data splitting and model retraining. Besides the low computational cost involved in their construction, these confidence intervals are shown through simulations to have good coverage and appropriate shrinking rate of their width in terms of the training sample size.

Submitted 11 March, 2022; v1 submitted 11 December, 2021; originally announced December 2021.

Comments: 10 pages

arXiv:2103.14137 [pdf, other]

Optimized Coverage Planning for UV Surface Disinfection

Authors: Joao Marcos Correia Marques, Ramya Ramalingam, Zherong Pan, Kris Hauser

Abstract: UV radiation has been used as a disinfection strategy to deactivate a wide range of pathogens, but existing irradiation strategies do not ensure sufficient exposure of all environmental surfaces and/or require long disinfection times. We present a near-optimal coverage planner for mobile UV disinfection robots. The formulation optimizes the irradiation time efficiency, while ensuring that a sufficient dosage of radiation is received by each surface. The trajectory and dosage plan are optimized taking collision and light occlusion constraints into account. We propose a two-stage scheme to approximate the solution of the induced NP-hard optimization, and, for efficiency, perform key irradiance and occlusion calculations on a GPU. Empirical results show that our technique achieves more coverage for the same exposure time as strategies for existing UV robots, can be used to compare UV robot designs, and produces near-optimal plans. This is an extended version of the paper originally contributed to ICRA2021.

Submitted 25 March, 2021; originally announced March 2021.

Comments: 13 pages, 18 figures

ACM Class: I.2.8; I.2.9

arXiv:1912.07661 [pdf, other]

It's easy to fool yourself: Case studies on identifying bias and confounding in bio-medical datasets

Authors: Subhashini Venugopalan, Arunachalam Narayanaswamy, Samuel Yang, Anton Geraschenko, Scott Lipnick, Nina Makhortova, James Hawrot, Christine Marques, Joao Pereira, Michael Brenner, Lee Rubin, Brian Wainger, Marc Berndl

Abstract: Confounding variables are a well known source of nuisance in biomedical studies. They present an even greater challenge when we combine them with black-box machine learning techniques that operate on raw data. This work presents two case studies. In one, we discovered biases arising from systematic errors in the data generation process. In the other, we found a spurious source of signal unrelated to the prediction task at hand. In both cases, our prediction models performed well but under careful examination hidden confounders and biases were revealed. These are cautionary tales on the limits of using machine learning techniques on raw data from scientific experiments.

Submitted 6 April, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

Comments: Accepted at NeurIPS 2019 LMRL workshop -- extended abstract track

arXiv:1703.10276 [pdf]

Análise comparativa de pesquisas de origens e destinos: uma abordagem baseada em Redes Complexas [Comparative analysis of origin-destination surveys: an approach based on complex networks]

Authors: Charles Marques, Carlos Caminha, Vasco Furtado

Abstract: In this paper, a comparative study was conducted between complex networks representing origin and destination survey data. Similarities were found between the characteristics of the networks of Brazilian cities and those of foreign cities. Power laws were found in the distributions of edge weights, and this scale-free behavior can occur due to the economic characteristics of the cities.

Submitted 29 March, 2017; originally announced March 2017.

Comments: 11 pages, in Portuguese, 2 figures

Showing 1–8 of 8 results for author: Marques, C