-
Exploring Optical Flow Inclusion into nnU-Net Framework for Surgical Instrument Segmentation
Authors:
Marcos Fernández-Rodríguez,
Bruno Silva,
Sandro Queirós,
Helena R. Torres,
Bruno Oliveira,
Pedro Morais,
Lukas R. Buschle,
Jorge Correia-Pinto,
Estevão Lima,
João L. Vilaça
Abstract:
Surgical instrument segmentation in laparoscopy is essential for computer-assisted surgical systems. Despite recent progress in deep learning, the dynamic setting of laparoscopic surgery still presents challenges for precise segmentation. The nnU-Net framework has excelled in semantic segmentation by analyzing single frames without temporal information. The framework's ease of use, including its ability to be automatically configured, and its low expertise requirements have made it a popular base framework for comparisons. Optical flow (OF) is a tool commonly used in video tasks to estimate motion and represent it in a single frame that contains temporal information. This work seeks to employ OF maps as an additional input to the nnU-Net architecture to improve its performance in the surgical instrument segmentation task, taking advantage of the fact that instruments are the main moving objects in the surgical field. With this new input, the temporal component is added indirectly without modifying the architecture. Using the CholecSeg8k dataset, three different representations of movement were estimated and used as new inputs, and compared with a baseline model. Results showed that the use of OF maps improves the detection of classes with high movement, even when these are scarce in the dataset. To further improve performance, future work may focus on implementing other OF-preserving augmentations.
Submitted 15 March, 2024;
originally announced March 2024.
-
Model-Free Local Recalibration of Neural Networks
Authors:
R. Torres,
D. J. Nott,
S. A. Sisson,
T. Rodrigues,
J. G. Reis,
G. S. Rodrigues
Abstract:
Artificial neural networks (ANNs) are highly flexible predictive models. However, reliably quantifying uncertainty for their predictions is a continuing challenge. There has been much recent work on "recalibration" of predictive distributions for ANNs, so that forecast probabilities for events of interest are consistent with certain frequency evaluations of them. Uncalibrated probabilistic forecasts are of limited use for many important decision-making tasks. To address this issue, we propose a localized recalibration of ANN predictive distributions using the dimension-reduced representation of the input provided by the ANN hidden layers. Our novel method draws inspiration from recalibration techniques used in the literature on approximate Bayesian computation and likelihood-free inference methods. Most existing calibration methods for ANNs can be thought of as calibrating either on the input layer, which is difficult when the input is high-dimensional, or the output layer, which may not be sufficiently flexible. Through a simulation study, we demonstrate that our method has good performance compared to alternative approaches, and explore the benefits that can be achieved by localizing the calibration based on different layers of the network. Finally, we apply our proposed method to a diamond price prediction problem, demonstrating the potential of our approach to improve prediction and uncertainty quantification in real-world applications.
Submitted 8 March, 2024;
originally announced March 2024.
-
Real-Time Emergency Vehicle Detection using Mel Spectrograms and Regular Expressions
Authors:
Alberto Pacheco-Gonzalez,
Raymundo Torres,
Raul Chacon,
Isidro Robledo
Abstract:
In emergency situations, the high-speed movement of an ambulance through city streets can be hindered by vehicular traffic. This work presents a method for detecting emergency vehicle sirens in real time. To obtain the audio fingerprint of a Hi-Lo siren, DSP and signal symbolization techniques were applied and contrasted against an audio classifier based on a deep neural network, using the same dataset of 280 ambient-sound recordings and 52 Hi-Lo siren recordings. For both methods, classification accuracy metrics were evaluated from the confusion matrix. The DSP algorithm has a slightly lower accuracy than the DNN model; however, it is self-explanatory, adjustable, portable, high-performance, and lower in energy consumption, which makes it a more viable, lower-cost ADAS implementation for identifying Hi-Lo sirens in real time.
Submitted 23 June, 2024; v1 submitted 25 September, 2023;
originally announced September 2023.
-
Data-driven Intra-Autonomous Systems Graph Generator
Authors:
Caio Vinicius Dadauto,
Nelson Luis Saldanha da Fonseca,
Ricardo da Silva Torres
Abstract:
Accurate modeling of realistic network topologies is essential for evaluating novel Internet solutions. Current topology generators, notably scale-free-based models, fail to capture multiple properties of intra-AS topologies. While scale-free networks encode node-degree distribution, they overlook crucial graph properties like betweenness, clustering, and assortativity. The limitations of existing generators pose challenges for training and evaluating deep learning models in communication networks, emphasizing the need for advanced topology generators encompassing diverse Internet topology characteristics. This paper introduces a novel deep-learning-based generator of synthetic graphs representing intra-autonomous systems in the Internet, named Deep-Generative Graphs for the Internet (DGGI). It also presents a novel massive dataset of real intra-AS graphs extracted from the ITDK project, called IGraphs. It is shown that DGGI creates synthetic graphs that accurately reproduce the properties of centrality, clustering, assortativity, and node degree. The DGGI generator outperforms existing Internet topology generators. On average, DGGI improves the MMD metric by $84.4\%$, $95.1\%$, $97.9\%$, and $94.7\%$ for assortativity, betweenness, clustering, and node degree, respectively.
Submitted 26 February, 2024; v1 submitted 9 August, 2023;
originally announced August 2023.
-
Evaluating ChatGPT text-mining of clinical records for obesity monitoring
Authors:
Ivo S. Fins,
Heather Davies,
Sean Farrell,
Jose R. Torres,
Gina Pinchbeck,
Alan D. Radford,
Peter-John Noble
Abstract:
Background: Veterinary clinical narratives remain a largely untapped resource for addressing complex diseases. Here we compare the ability of a large language model (ChatGPT) and a previously developed regular expression (RegexT) to identify overweight body condition scores (BCS) in veterinary narratives. Methods: BCS values were extracted from 4,415 anonymised clinical narratives using either RegexT or by appending the narrative to a prompt sent to ChatGPT coercing the model to return the BCS information. Data were manually reviewed for comparison. Results: The precision of RegexT was higher (100%, 95% CI 94.81-100%) than that of ChatGPT (89.3%, 95% CI 82.75-93.64%). However, the recall of ChatGPT (100%, 95% CI 96.18-100%) was considerably higher than that of RegexT (72.6%, 95% CI 63.92-79.94%). Limitations: Subtle prompt engineering is needed to improve ChatGPT output. Conclusions: Large language models create diverse opportunities and, whilst complex, present an intuitive interface to information but require careful implementation to avoid unpredictable errors.
Submitted 3 August, 2023;
originally announced August 2023.
-
On the Generalized Mean Densest Subgraph Problem: Complexity and Algorithms
Authors:
Chandra Chekuri,
Manuel R. Torres
Abstract:
Dense subgraph discovery is an important problem in graph mining and network analysis with several applications. Two canonical problems here are to find a maxcore (subgraph of maximum min degree) and to find a densest subgraph (subgraph of maximum average degree). Both of these problems can be solved in polynomial time. Veldt, Benson, and Kleinberg [VBK21] introduced the generalized $p$-mean densest subgraph problem which captures the maxcore problem when $p=-\infty$ and the densest subgraph problem when $p=1$. They observed that the objective leads to a supermodular function when $p \ge 1$ and hence can be solved in polynomial time; for this case, they also developed a simple greedy peeling algorithm with a bounded approximation ratio. In this paper, we make several contributions. First, we prove that for any $p \in (-\frac{1}{8}, 0) \cup (0, \frac{1}{4})$ the problem is NP-Hard and for any $p \in (-3,0) \cup (0,1)$ the weighted version of the problem is NP-Hard, partly resolving a question left open in [VBK21]. Second, we describe two simple $1/2$-approximation algorithms for all $p < 1$, and show that our analysis of these algorithms is tight. For $p > 1$ we develop a fast near-linear time implementation of the greedy peeling algorithm from [VBK21]. This allows us to plug it into the iterative peeling algorithm that was shown to converge to an optimum solution [CQT22]. We demonstrate the efficacy of our algorithms by running extensive experiments on large graphs. Together, our results provide a comprehensive understanding of the complexity of the $p$-mean densest subgraph problem and lead to fast and provably good algorithms for the full range of $p$.
Submitted 3 June, 2023;
originally announced June 2023.
-
Why is the winner the best?
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Sharib Ali,
Vincent Andrearczyk,
Marc Aubreville,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano,
Jorge Bernal,
Sebastian Bodenstedt,
Alessandro Casella,
Veronika Cheplygina,
Marie Daum,
Marleen de Bruijne,
Adrien Depeursinge,
Reuben Dorent,
Jan Egger,
David G. Ellis,
Sandy Engelhardt,
Melanie Ganz
, et al. (100 additional authors not shown)
Abstract:
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
Submitted 30 March, 2023;
originally announced March 2023.
-
CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection
Authors:
Chinedu Innocent Nwoye,
Tong Yu,
Saurav Sharma,
Aditya Murali,
Deepak Alapatt,
Armine Vardazaryan,
Kun Yuan,
Jonas Hajek,
Wolfgang Reiter,
Amine Yamlahi,
Finn-Henri Smidt,
Xiaoyang Zou,
Guoyan Zheng,
Bruno Oliveira,
Helena R. Torres,
Satoshi Kondo,
Satoshi Kasai,
Felix Holm,
Ege Özsoy,
Shuangchun Gui,
Han Li,
Sista Raviteja,
Rachana Sathish,
Pranav Poudel,
Binod Bhattarai
, et al. (24 additional authors not shown)
Abstract:
Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021 have put together techniques aimed at recognizing these triplets from surgical footage. Estimating also the spatial locations of the triplets would offer a more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of <instrument, verb, target> triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods, an in-depth analysis of the obtained results across multiple metrics, visual and procedural challenges; their significance, and useful insights for future research directions and applications in surgery.
Submitted 14 July, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
Language statistics at different spatial, temporal, and grammatical scales
Authors:
Fernanda Sánchez-Puig,
Rogelio Lozano-Aranda,
Dante Pérez-Méndez,
Ewan Colman,
Alfredo J. Morales-Guzmán,
Carlos Pineda,
Pedro Juan Rivera Torres,
Carlos Gershenson
Abstract:
Statistical linguistics has advanced considerably in recent decades as data has become available. This has allowed researchers to study how statistical properties of languages change over time. In this work, we use data from Twitter to explore English and Spanish considering the rank diversity at different scales: temporal (from 3 to 96 hour intervals), spatial (from 3km to 3000+km radii), and grammatical (from monograms to pentagrams). We find that all three scales are relevant. However, the greatest changes come from variations in the grammatical scale. At the lowest grammatical scale (monograms), the rank diversity curves are most similar, independently of the values of other scales, languages, and countries. As the grammatical scale grows, the rank diversity curves vary more depending on the temporal and spatial scales, as well as on the language and country. We also study the statistics of Twitter-specific tokens: emojis, hashtags, and user mentions. These particular types of tokens show a sigmoid kind of behaviour as a rank diversity function. Our results help to quantify which aspects of language statistics seem universal and which may lead to variations.
Submitted 26 July, 2022; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Fetal Brain Tissue Annotation and Segmentation Challenge Results
Authors:
Kelly Payette,
Hongwei Li,
Priscille de Dumast,
Roxane Licandro,
Hui Ji,
Md Mahfuzur Rahman Siddiquee,
Daguang Xu,
Andriy Myronenko,
Hao Liu,
Yuchen Pei,
Lisheng Wang,
Ying Peng,
Juanying Xie,
Huiquan Zhang,
Guiming Dong,
Hao Fu,
Guotai Wang,
ZunHyan Rieu,
Donghyeon Kim,
Hyun Gi Kim,
Davood Karimi,
Ali Gholipour,
Helena R. Torres,
Bruno Oliveira,
João L. Vilaça
, et al. (33 additional authors not shown)
Abstract:
In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the developing human brain. Automatic segmentation of the developing fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variability. Therefore, we organized the Fetal Tissue Annotation (FeTA) Challenge in 2021 in order to encourage the development of automatic segmentation algorithms on an international level. The challenge utilized the FeTA Dataset, an open dataset of fetal brain MRI reconstructions segmented into seven different tissues (external cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, brainstem, deep grey matter). 20 international teams participated in this challenge, submitting a total of 21 algorithms for evaluation. In this paper, we provide a detailed analysis of the results from both a technical and clinical perspective. All participants relied on deep learning methods, mainly U-Nets, with some variability present in the network architecture, optimization, and image pre- and post-processing. The majority of teams used existing medical imaging deep learning frameworks. The main differences between the submissions were the fine tuning done during training, and the specific pre- and post-processing steps performed. The challenge results showed that almost all submissions performed similarly. Four of the top five teams used ensemble learning methods. However, one team's algorithm performed significantly better than the other submissions, and consisted of an asymmetrical U-Net network architecture. This paper provides a first of its kind benchmark for future automatic multi-tissue segmentation algorithms for the developing human brain in utero.
Submitted 20 April, 2022;
originally announced April 2022.
-
CholecTriplet2021: A benchmark challenge for surgical action triplet recognition
Authors:
Chinedu Innocent Nwoye,
Deepak Alapatt,
Tong Yu,
Armine Vardazaryan,
Fangfang Xia,
Zixuan Zhao,
Tong Xia,
Fucang Jia,
Yuxuan Yang,
Hao Wang,
Derong Yu,
Guoyan Zheng,
Xiaotian Duan,
Neil Getty,
Ricardo Sanchez-Matilla,
Maria Robu,
Li Zhang,
Huabin Chen,
Jiacheng Wang,
Liansheng Wang,
Bokai Zhang,
Beerend Gerats,
Sista Raviteja,
Rachana Sathish,
Rong Tao
, et al. (37 additional authors not shown)
Abstract:
Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of <instrument, verb, target> combination delivers comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms by competing teams are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison between them, in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition which is of utmost importance for the development of AI in surgery.
Submitted 29 December, 2022; v1 submitted 10 April, 2022;
originally announced April 2022.
-
Black-box Error Diagnosis in Deep Neural Networks for Computer Vision: a Survey of Tools
Authors:
Piero Fraternali,
Federico Milani,
Rocio Nahime Torres,
Niccolò Zangrando
Abstract:
The application of Deep Neural Networks (DNNs) to a broad variety of tasks demands methods for coping with the complex and opaque nature of these architectures. When a gold standard is available, performance assessment treats the DNN as a black box and computes standard metrics based on the comparison of the predictions with the ground truth. A deeper understanding of performance requires going beyond such evaluation metrics to diagnose the model behavior and the prediction errors. This goal can be pursued in two complementary ways. On the one hand, model interpretation techniques "open the box" and assess the relationship between the input, the inner layers and the output, so as to identify the architecture modules most likely to cause the performance loss. On the other hand, black-box error diagnosis techniques study the correlation between the model response and some properties of the input not used for training, so as to identify the features of the inputs that make the model fail. Both approaches give hints on how to improve the architecture and/or the training process. This paper focuses on the application of DNNs to Computer Vision (CV) tasks and presents a survey of the tools that support the black-box performance diagnosis paradigm. It illustrates the features and gaps of the current proposals, discusses the relevant research directions and provides a brief overview of the diagnosis tools in sectors other than CV.
Submitted 22 December, 2022; v1 submitted 17 January, 2022;
originally announced January 2022.
-
Efficient Deep Learning Architectures for Fast Identification of Bacterial Strains in Resource-Constrained Devices
Authors:
R. Gallardo García,
S. Jarquín Rodríguez,
B. Beltrán Martínez,
C. Hernández Gracidas,
R. Martínez Torres
Abstract:
This work presents twelve fine-tuned deep learning architectures to solve the bacterial classification problem over the Digital Image of Bacterial Species Dataset. The base architectures were mainly published as mobile or efficient solutions to the ImageNet challenge, and all experiments presented in this work consisted of making several modifications to the original designs so that they can solve the bacterial classification problem through fine-tuning and transfer learning techniques. This work also proposes a novel data augmentation technique for this dataset, based on the idea of artificial zooming, which strongly increases the performance of every tested architecture, even doubling it in some cases. In order to get robust and complete evaluations, all experiments were performed with 10-fold cross-validation and evaluated with five different metrics: top-1 and top-5 accuracy, precision, recall, and F1 score. This paper presents a complete comparison of the twelve different architectures, cross-validated with the original and the augmented version of the dataset; the results are also compared with several methods from the literature. Overall, eight of the eleven architectures surpassed 0.95 in top-1 accuracy with our data augmentation method, with 0.9738 being the highest top-1 accuracy. The impact of the data augmentation technique is reported with relative improvement scores.
Submitted 11 June, 2021;
originally announced June 2021.
-
Measuring economic activity from space: a case study using flying airplanes and COVID-19
Authors:
Mauricio Pamplona Segundo,
Allan Pinto,
Rodrigo Minetto,
Ricardo da Silva Torres,
Sudeep Sarkar
Abstract:
This work introduces a novel solution to measure economic activity through remote sensing for a wide range of spatial areas. We hypothesized that disturbances in human behavior caused by major life-changing events leave signatures in satellite imagery that allow devising relevant image-based indicators to estimate their impacts and support decision-makers. We present a case study for the COVID-19 coronavirus outbreak, which imposed severe mobility restrictions and caused worldwide disruptions, using flying airplane detection around the 30 busiest airports in Europe to quantify and analyze the lockdown's effects and post-lockdown recovery. Our solution won the Rapid Action Coronavirus Earth observation (RACE) upscaling challenge, sponsored by the European Space Agency and the European Commission, and is now part of the RACE dashboard. This platform combines satellite data and artificial intelligence to promote a progressive and safe reopening of essential activities. Code and CNN models are available at https://github.com/maups/covid19-custom-script-contest
Submitted 21 April, 2021;
originally announced April 2021.
-
Reinforcement Learning with Probabilistic Boolean Network Models of Smart Grid Devices
Authors:
Pedro J. Rivera Torres,
Carlos Gershenson García,
Samir Kanaan Izquierdo
Abstract:
The area of Smart Power Grids needs to constantly improve its efficiency and resilience, to provide high quality electrical power, in a resistant grid, managing faults and avoiding failures. Achieving this requires high component reliability, adequate maintenance, and a studied failure occurrence. Correct system operation involves those activities, and novel methodologies to detect, classify, and isolate faults and failures, model and simulate processes with predictive algorithms and analytics (using data analysis and asset condition to plan and perform activities). We showcase the application of a complex-adaptive, self-organizing modeling method, Probabilistic Boolean Networks (PBN), as a way towards the understanding of the dynamics of smart grid devices, and to model and characterize their behavior. This work demonstrates that PBNs are equivalent to the standard Reinforcement Learning Cycle, in which the agent/model has an interaction with its environment and receives feedback from it in the form of a reward signal. Different reward structures were created in order to characterize preferred behavior. This information can be used to guide the PBN to avoid fault conditions and failures.
△ Less
Submitted 1 February, 2021;
originally announced February 2021.
-
A Soft Computing Approach for Selecting and Combining Spectral Bands
Authors:
Juan F. H. Albarracín,
Rafael S. Oliveira,
Marina Hirota,
Jefersson A. dos Santos,
Ricardo da S. Torres
Abstract:
We introduce a soft computing approach for automatically selecting and combining indices from remote sensing multispectral images that can be used for classification tasks. The proposed approach is based on a Genetic-Programming (GP) framework, a technique successfully used in a wide variety of optimization problems. Through GP, it is possible to learn indices that maximize the separability of samples from two different classes. Once the indices specialized for all the pairs of classes are obtained, they are used in pixelwise classification tasks. We used the GP-based solution to evaluate complex classification problems, such as those that are related to the discrimination of vegetation types within and between tropical biomes. Using time series defined in terms of the learned spectral indices, we show that the GP framework leads to better results than other indices that are used to discriminate and classify tropical biomes.
Submitted 10 November, 2020;
originally announced November 2020.
-
Fast Approximation Algorithms for Bounded Degree and Crossing Spanning Tree Problems
Authors:
Chandra Chekuri,
Kent Quanrud,
Manuel R. Torres
Abstract:
We develop fast approximation algorithms for the minimum-cost version of the Bounded-Degree MST problem (BD-MST) and its generalization the Crossing Spanning Tree problem (Crossing-ST). We solve the underlying LP to within a $(1+ε)$ approximation factor in near-linear time via the multiplicative weight update (MWU) technique. This yields, in particular, a near-linear time algorithm that outputs an estimate $B$ such that $B \le B^* \le \lceil (1+ε)B \rceil +1$ where $B^*$ is the minimum-degree of a spanning tree of a given graph. To round the fractional solution, in our main technical contribution, we describe a fast near-linear time implementation of swap-rounding in the spanning tree polytope of a graph. The fractional solution can also be used to sparsify the input graph that can in turn be used to speed up existing combinatorial algorithms. Together, these ideas lead to significantly faster approximation algorithms than known before for the two problems of interest. In addition, a fast algorithm for swap rounding in the graphic matroid is a generic tool that has other applications, including to TSP and submodular function maximization.
Submitted 17 May, 2021; v1 submitted 6 November, 2020;
originally announced November 2020.
-
Principled Interpolation in Normalizing Flows
Authors:
Samuel G. Fadel,
Sebastian Mair,
Ricardo da S. Torres,
Ulf Brefeld
Abstract:
Generative models based on normalizing flows are very successful in modeling complex data distributions using simpler ones. However, straightforward linear interpolations show unexpected side effects, as interpolation paths lie outside the area where samples are observed. This is caused by the standard choice of Gaussian base distributions and can be seen in the norms of the interpolated samples. This observation suggests that correcting the norm should generally result in better interpolations, but it is not clear how to correct the norm in an unambiguous way. In this paper, we solve this issue by enforcing a fixed norm and, hence, changing the base distribution, to allow for a principled way of interpolation. Specifically, we use the Dirichlet and von Mises-Fisher base distributions. Our experimental results show superior performance in terms of bits per dimension, Fréchet Inception Distance (FID), and Kernel Inception Distance (KID) scores for interpolation, while maintaining the same generative performance.
Submitted 22 October, 2020;
originally announced October 2020.
-
Parallax Motion Effect Generation Through Instance Segmentation And Depth Estimation
Authors:
Allan Pinto,
Manuel A. Córdova,
Luis G. L. Decker,
Jose L. Flores-Campana,
Marcos R. Souza,
Andreza A. dos Santos,
Jhonatas S. Conceição,
Henrique F. Gagliardi,
Diogo C. Luvizon,
Ricardo da S. Torres,
Helio Pedrini
Abstract:
Stereo vision is a growing topic in computer vision due to the innumerable opportunities and applications this technology offers for the development of modern solutions, such as virtual and augmented reality applications. To enhance the user's experience in three-dimensional virtual environments, the motion parallax estimation is a promising technique to achieve this objective. In this paper, we propose an algorithm for generating parallax motion effects from a single image, taking advantage of state-of-the-art instance segmentation and depth estimation approaches. This work also presents a comparison against such algorithms to investigate the trade-off between efficiency and quality of the parallax motion effects, taking into consideration a multi-task learning network capable of estimating instance segmentation and depth estimation at once. Experimental results and visual quality assessment indicate that the PyD-Net network (depth estimation) combined with Mask R-CNN or FBNet networks (instance segmentation) can produce parallax motion effects with good visual quality.
Submitted 6 October, 2020;
originally announced October 2020.
-
Automated Neuron Shape Analysis from Electron Microscopy
Authors:
Sharmishtaa Seshamani,
Leila Elabbady,
Casey Schneider-Mizell,
Gayathri Mahalingam,
Sven Dorkenwald,
Agnes Bodor,
Thomas Macrina,
Daniel Bumbarger,
JoAnn Buchanan,
Marc Takeno,
Wenjing Yin,
Derrick Brittain,
Russel Torres,
Daniel Kapner,
Kisuk Lee,
Ran Lu,
Jingpeng Wu,
Nuno daCosta,
Clay Reid,
Forrest Collman
Abstract:
Morphology based analysis of cell types has been an area of great interest to the neuroscience community for several decades. Recently, high resolution electron microscopy (EM) datasets of the mouse brain have opened up opportunities for data analysis at a level of detail that was previously impossible. These datasets are very large in nature and thus, manual analysis is not a practical solution. Of particular interest are details to the level of post synaptic structures. This paper proposes a fully automated framework for analysis of post-synaptic structure based neuron analysis from EM data. The processing framework involves shape extraction, representation with an autoencoder, and whole cell modeling and analysis based on shape distributions. We apply our novel framework on a dataset of 1031 neurons obtained from imaging a 1mm x 1mm x 40 micrometer volume of the mouse visual cortex and show the strength of our method in clustering and classification of neuronal shapes.
Submitted 29 May, 2020;
originally announced June 2020.
-
Multimodal Prediction based on Graph Representations
Authors:
Icaro Cavalcante Dourado,
Salvatore Tabbone,
Ricardo da Silva Torres
Abstract:
This paper proposes a learning model, based on rank-fusion graphs, for general applicability in multimodal prediction tasks, such as multimodal regression and image classification. Rank-fusion graphs encode information from multiple descriptors and retrieval models, thus being able to capture underlying relationships between modalities, samples, and the collection itself. The solution is based on the encoding of multiple ranks for a query (or test sample), defined according to different criteria, into a graph. Later, we project the generated graph into an induced vector space, creating fusion vectors, targeting broader generality and efficiency. A fusion vector estimator is then built to infer whether a multimodal input object refers to a class or not. Our method is capable of promoting a fusion model better than early-fusion and late-fusion alternatives. Performed experiments in the context of multiple multimodal and visual datasets, as well as several descriptors and retrieval models, demonstrate that our learning model is highly effective for different prediction scenarios involving visual, textual, and multimodal features, yielding better effectiveness than state-of-the-art methods.
Submitted 3 July, 2020; v1 submitted 21 December, 2019;
originally announced December 2019.
-
Fusion vectors: Embedding Graph Fusions for Efficient Unsupervised Rank Aggregation
Authors:
Icaro Cavalcante Dourado,
Ricardo da Silva Torres
Abstract:
The vast increase in the amount and complexity of digital content led to a wide interest in ad-hoc retrieval systems in recent years. Complementarily, the existence of heterogeneous data sources and retrieval models stimulated the proliferation of increasingly ingenious and effective rank aggregation functions. Although recently proposed rank aggregation functions are promising with respect to effectiveness, existing proposals in the area usually overlook efficiency aspects. We propose an innovative rank aggregation function that is unsupervised, intrinsically multimodal, and targeted for fast retrieval and top effectiveness performance. We introduce the concepts of embedding and indexing of graph-based rank-aggregation representation models, and their application for search tasks. Embedding formulations are also proposed for graph-based rank representations. We introduce the concept of fusion vectors, a late-fusion representation of objects based on ranks, from which an intrinsically rank-aggregation retrieval model is defined. Next, we present an approach for fast retrieval based on fusion vectors, thus promoting an efficient rank aggregation system. Our method presents top effectiveness performance among state-of-the-art related work, while bringing novel aspects of multimodality and effectiveness. Consistent speedups are achieved against the recent baselines in all datasets considered.
Submitted 1 July, 2019; v1 submitted 14 June, 2019;
originally announced June 2019.
-
Spatio-Temporal Vegetation Pixel Classification By Using Convolutional Networks
Authors:
Keiller Nogueira,
Jefersson A. dos Santos,
Nathalia Menini,
Thiago S. F. Silva,
Leonor Patricia C. Morellato,
Ricardo da S. Torres
Abstract:
Plant phenology studies rely on long-term monitoring of life cycles of plants. High-resolution unmanned aerial vehicles (UAVs) and near-surface technologies have been used for plant monitoring, demanding the creation of methods capable of locating and identifying plant species through time and space. However, this is a challenging task given the high volume of data, the data frequently missing from temporal datasets, the heterogeneity of temporal profiles, the variety of plant visual patterns, and the unclear definition of individuals' boundaries in plant communities. In this letter, we propose a novel method, suitable for phenological monitoring, based on Convolutional Networks (ConvNets) to perform spatio-temporal vegetation pixel-classification on high resolution images. We conducted a systematic evaluation using high-resolution vegetation image datasets associated with the Brazilian Cerrado biome. Experimental results show that the proposed approach is effective, outperforming other spatio-temporal pixel-classification strategies.
Submitted 2 March, 2019;
originally announced March 2019.
-
$\ell_1$-sparsity Approximation Bounds for Packing Integer Programs
Authors:
Chandra Chekuri,
Kent Quanrud,
Manuel R. Torres
Abstract:
We consider approximation algorithms for packing integer programs (PIPs) of the form $\max\{\langle c, x\rangle : Ax \le b, x \in \{0,1\}^n\}$ where $c$, $A$, and $b$ are nonnegative. We let $W = \min_{i,j} b_i / A_{i,j}$ denote the width of $A$ which is at least $1$. Previous work by Bansal et al. \cite{bansal-sparse} obtained an $Ω(\frac{1}{Δ_0^{1/\lfloor W \rfloor}})$-approximation ratio where $Δ_0$ is the maximum number of nonzeroes in any column of $A$ (in other words the $\ell_0$-column sparsity of $A$). They raised the question of obtaining approximation ratios based on the $\ell_1$-column sparsity of $A$ (denoted by $Δ_1$) which can be much smaller than $Δ_0$. Motivated by recent work on covering integer programs (CIPs) \cite{cq,chs-16} we show that simple algorithms based on randomized rounding followed by alteration, similar to those of Bansal et al. \cite{bansal-sparse} (but with a twist), yield approximation ratios for PIPs based on $Δ_1$. First, following an integrality gap example from \cite{bansal-sparse}, we observe that the case of $W=1$ is as hard as maximum independent set even when $Δ_1 \le 2$. In sharp contrast to this negative result, as soon as width is strictly larger than one, we obtain positive results via the natural LP relaxation. For PIPs with width $W = 1 + ε$ where $ε\in (0,1]$, we obtain an $Ω(ε^2/Δ_1)$-approximation. In the large width regime, when $W \ge 2$, we obtain an $Ω((\frac{1}{1 + Δ_1/W})^{1/(W-1)})$-approximation. We also obtain a $(1-ε)$-approximation when $W = Ω(\frac{\log (Δ_1/ε)}{ε^2})$.
Submitted 22 February, 2019;
originally announced February 2019.
-
Unsupervised Graph-based Rank Aggregation for Improved Retrieval
Authors:
Icaro Cavalcante Dourado,
Daniel Carlos Guimarães Pedronette,
Ricardo da Silva Torres
Abstract:
This paper presents a robust and comprehensive graph-based rank aggregation approach, used to combine results of isolated ranker models in retrieval tasks. The method follows an unsupervised scheme, which is independent of how the isolated ranks are formulated. Our approach is able to combine arbitrary models, defined in terms of different ranking criteria, such as those based on textual, image or hybrid content representations.
We reformulate the ad-hoc retrieval problem as a document retrieval based on fusion graphs, which we propose as a new unified representation model capable of merging multiple ranks and expressing inter-relationships of retrieval results automatically. By doing so, we claim that the retrieval system can benefit from learning the manifold structure of datasets, thus leading to more effective results. Another contribution is that our graph-based aggregation formulation, unlike existing approaches, allows for encapsulating contextual information encoded from multiple ranks, which can be directly used for ranking, without further computations and post-processing steps over the graphs. Based on the graphs, a novel similarity retrieval score is formulated using an efficient computation of minimum common subgraphs. Finally, another benefit over existing approaches is the absence of hyperparameters.
A comprehensive experimental evaluation was conducted considering diverse well-known public datasets, composed of textual, image, and multimodal documents. Performed experiments demonstrate that our method reaches top performance, yielding better effectiveness scores than state-of-the-art baseline methods and promoting large gains over the rankers being fused, thus demonstrating the successful capability of the proposal in representing queries based on a unified graph-based model of rank fusions.
Submitted 18 March, 2019; v1 submitted 17 January, 2019;
originally announced January 2019.
-
Survey of Bayesian Networks Applications to Intelligent Autonomous Vehicles
Authors:
Rocío Díaz de León Torres,
Martín Molina,
Pascual Campoy
Abstract:
This article reviews the applications of Bayesian Networks to Intelligent Autonomous Vehicles (IAV) from the decision making point of view, which represents the final step for fully Autonomous Vehicles (currently under discussion). Until now, when it comes to making high level decisions for Autonomous Vehicles (AVs), humans have the last word. Based on the works cited in this article and the analysis done here, the modules of a general decision making framework and its variables are inferred. Many efforts have been made in the labs showing Bayesian Networks as a promising computer model for decision making. Further research should go in the direction of testing Bayesian Network models in real situations. In addition to the applications, Bayesian Network fundamentals are introduced as elements to consider when developing IAVs with the potential of making high level judgement calls.
Submitted 21 February, 2019; v1 submitted 16 January, 2019;
originally announced January 2019.
-
Link Prediction in Dynamic Graphs for Recommendation
Authors:
Samuel G. Fadel,
Ricardo da S. Torres
Abstract:
Recent advances in employing neural networks on graph domains helped push the state of the art in link prediction tasks, particularly in recommendation services. However, the use of temporal contextual information, often modeled as dynamic graphs that encode the evolution of user-item relationships over time, has been overlooked in link prediction problems. In this paper, we consider the hypothesis that leveraging such information enables models to make better predictions, proposing a new neural network approach for this. Our experiments, performed on the widely used ML-100k and ML-1M datasets, show that our approach produces better predictions in scenarios where the pattern of user-item relationships changes over time. In addition, they suggest that existing approaches are significantly impacted by those changes.
Submitted 17 November, 2018;
originally announced November 2018.
-
Geometric Fingerprint Recognition via Oriented Point-Set Pattern Matching
Authors:
David Eppstein,
Michael T. Goodrich,
Jordan Jorgensen,
Manuel R. Torres
Abstract:
Motivated by the problem of fingerprint matching, we present geometric approximation algorithms for matching a pattern point set against a background point set, where the points have angular orientations in addition to their positions.
Submitted 1 August, 2018;
originally announced August 2018.
-
A Genetic Algorithm Approach for Image Representation Learning through Color Quantization
Authors:
Érico M. Pereira,
Ricardo da S. Torres,
Jefersson A. dos Santos
Abstract:
Over the last decades, hand-crafted feature extractors have been used to encode image visual properties into feature vectors. Recently, data-driven feature learning approaches have been successfully explored as alternatives for producing more representative visual features. In this work, we combine both research avenues, focusing on the color quantization problem. We propose two data-driven approaches to learn image representations through the search for optimized quantization schemes, which lead to more effective feature extraction algorithms and compact representations. Our strategy employs a Genetic Algorithm, a soft-computing apparatus successfully utilized in Information-retrieval-related optimization problems. We hypothesize that changing the quantization affects the quality of image description approaches, leading to effective and efficient representations. We evaluate our approaches in content-based image retrieval tasks, considering eight well-known datasets with different visual properties. Results indicate that the approach focused on representation effectiveness outperformed baselines in all tested scenarios. The other approach, which also considers the size of created representations, produced competitive results keeping or even reducing the dimensionality of feature vectors up to 25%.
Submitted 20 November, 2020; v1 submitted 17 November, 2017;
originally announced November 2017.
-
Exploiting ConvNet Diversity for Flooding Identification
Authors:
Keiller Nogueira,
Samuel G. Fadel,
Ícaro C. Dourado,
Rafael de O. Werneck,
Javier A. V. Muñoz,
Otávio A. B. Penatti,
Rodrigo T. Calumby,
Lin Tzy Li,
Jefersson A. dos Santos,
Ricardo da S. Torres
Abstract:
Flooding is the world's most costly type of natural disaster in terms of both economic losses and human casualties. A first and essential procedure towards flood monitoring is based on identifying the area most vulnerable to flooding, which gives authorities relevant regions to focus on. In this work, we propose several methods to perform flooding identification in high-resolution remote sensing images using deep learning. Specifically, some proposed techniques are based upon unique networks, such as dilated and deconvolutional ones, while others were conceived to exploit the diversity of distinct networks in order to extract the maximum performance of each classifier. Evaluation of the proposed algorithms was conducted on a high-resolution remote sensing dataset. Results show that the proposed algorithms outperformed several state-of-the-art baselines, providing improvements ranging from 1 to 4% in terms of the Jaccard Index.
Submitted 5 June, 2018; v1 submitted 9 November, 2017;
originally announced November 2017.
-
An Exact Approach for the Balanced k-Way Partitioning Problem with Weight Constraints and its Application to Sports Team Realignment
Authors:
Diego Recalde,
Daniel Severín,
Ramiro Torres,
Polo Vaca
Abstract:
In this work a balanced k-way partitioning problem with weight constraints is defined to model the sports team realignment. Sports teams must be partitioned into a fixed number of groups according to some regulations, where the total distance of the road trips that all teams must travel to play a Double Round Robin Tournament in each group is minimized. Two integer programming formulations for this problem are introduced, and the validity of three families of inequalities associated with the polytope of these formulations is proved. The performance of a tabu search procedure and a Branch & Cut algorithm, which uses the valid inequalities as cuts, is evaluated over simulated and real-world instances. In particular, an optimal solution for the realignment of the Ecuadorian Football league is reported and the methodology can be suitably adapted for the realignment of other sports leagues.
Submitted 5 September, 2017;
originally announced September 2017.
-
A Topological Algorithm for Determining How Road Networks Evolve Over Time
Authors:
M T Goodrich,
Siddharth Gupta,
Manuel R. Torres
Abstract:
We provide an efficient algorithm for determining how a road network has evolved over time, given two snapshot instances from different dates. To allow for such determinations across different databases and even against hand drawn maps, we take a strictly topological approach in this paper, so that we compare road networks based strictly on graph-theoretic properties. Given two road networks of the same region from two different dates, our approach allows one to match road network portions that remain intact and also point out added or removed portions. We analyze our algorithm both theoretically, showing that it runs in polynomial time for non-degenerate road networks even though a related problem is NP-complete, and experimentally, using dated road networks from the TIGER/Line archive of the U.S. Census Bureau.
Submitted 23 September, 2016;
originally announced September 2016.
-
Learning from Imbalanced Multiclass Sequential Data Streams Using Dynamically Weighted Conditional Random Fields
Authors:
Roberto L. Shinmoto Torres,
Damith C. Ranasinghe,
Qinfeng Shi,
Anton van den Hengel
Abstract:
The present study introduces a method for improving the classification performance of imbalanced multiclass data streams from wireless body worn sensors. Data imbalance is an inherent problem in activity recognition caused by the irregular time distribution of activities, which are sequential and dependent on previous movements. We use conditional random fields (CRF), a graphical model for structured classification, to take advantage of dependencies between activities in a sequence. However, CRFs do not consider the negative effects of class imbalance during training. We propose a class-wise dynamically weighted CRF (dWCRF) where weights are automatically determined during training by maximizing the expected overall F-score. Our results based on three case studies from a healthcare application using a batteryless body worn sensor, demonstrate that our method, in general, improves overall and minority class F-score when compared to other CRF based classifiers and achieves similar or better overall and class-wise performance when compared to SVM based classifiers under conditions of limited training data. We also confirm the performance of our approach using an additional battery powered body worn sensor dataset, achieving similar results in cases of high class imbalance.
Submitted 11 March, 2016;
originally announced March 2016.
-
Semantic Diversity versus Visual Diversity in Visual Dictionaries
Authors:
Otávio A. B. Penatti,
Sandra Avila,
Eduardo Valle,
Ricardo da S. Torres
Abstract:
Visual dictionaries are a critical component for image classification/retrieval systems based on the bag-of-visual-words (BoVW) model. Dictionaries are usually learned without supervision from a training set of images sampled from the collection of interest. However, for large, general-purpose, dynamic image collections (e.g., the Web), obtaining a representative sample in terms of semantic concepts is not straightforward. In this paper, we evaluate the impact of semantics on dictionary quality, aiming to verify the importance of semantic diversity in relation to visual diversity for visual dictionaries. In the experiments, we vary the number of classes used for creating the dictionary and then compute different BoVW descriptors, using multiple codebook sizes and different coding and pooling methods (standard BoVW and Fisher Vectors). Results for image classification show that, as visual dictionaries are based on low-level visual appearances, visual diversity is more important than semantic diversity. Our conclusions open the opportunity to alleviate the burden of generating visual dictionaries, as we need only a visually diverse set of images instead of the whole collection to create a good dictionary.
Submitted 20 November, 2015;
originally announced November 2015.
-
Approximate Similarity Search for Online Multimedia Services on Distributed CPU-GPU Platforms
Authors:
George Teodoro,
Eduardo Valle,
Nathan Mariano,
Ricardo Torres,
Wagner Meira Jr,
Joel H. Saltz
Abstract:
Similarity search in high-dimensional spaces is a pivotal operation found in a variety of database applications. Recently, there has been an increased interest in similarity search for online content-based multimedia services. Those services, however, introduce new challenges with respect to the very large volumes of data that have to be indexed/searched, and the need to minimize response times observed by the end-users. Additionally, those users dynamically interact with the systems, creating fluctuating query request rates and requiring the search algorithm to adapt in order to better utilize the underlying hardware to reduce response times. In order to address these challenges, we introduce Hypercurves, a flexible framework for answering approximate k-nearest neighbor (kNN) queries for very large multimedia databases, aiming at online content-based multimedia services. Hypercurves executes on hybrid CPU-GPU environments, and is able to employ those devices cooperatively to support massive query request rates. In order to keep the response times optimal as the request rates vary, it employs a novel dynamic scheduler to partition the work between CPU and GPU. Hypercurves was thoroughly evaluated using a large database of multimedia descriptors. Its cooperative CPU-GPU execution achieved performance improvements of up to 30x when compared to the single CPU-core version. The dynamic work partition mechanism reduces the observed query response times by about 50% when compared to the best static CPU-GPU task partition configuration. In addition, Hypercurves achieves superlinear scalability in distributed (multi-node) executions, while keeping a high guarantee of equivalence with its sequential version, thanks to the proof of probabilistic equivalence, which supported its aggressive parallelization design.
Submitted 3 September, 2012;
originally announced September 2012.
-
Are visual dictionaries generalizable?
Authors:
Otavio A. B. Penatti,
Eduardo Valle,
Ricardo da S. Torres
Abstract:
Mid-level features based on visual dictionaries are today a cornerstone of systems for classification and retrieval of images. Those state-of-the-art representations depend crucially on the choice of a codebook (visual dictionary), which is usually derived from the dataset. In general-purpose, dynamic image collections (e.g., the Web), one cannot have the entire collection in order to extract a representative dictionary. However, based on the hypothesis that the dictionary reflects only the diversity of low-level appearances and does not capture semantics, we argue that a dictionary based on a small subset of the data, or even on an entirely different dataset, is able to produce a good representation, provided that the chosen images span a diverse enough portion of the low-level feature space. Our experiments confirm that hypothesis, opening the opportunity to greatly alleviate the burden in generating the codebook, and confirming the feasibility of employing visual dictionaries in large-scale dynamic environments.
Submitted 11 May, 2012;
originally announced May 2012.
-
Bayesian approach for near-duplicate image detection
Authors:
Lucas Moutinho Bueno,
Eduardo Valle,
Ricardo da Silva Torres
Abstract:
In this paper we propose a Bayesian approach for near-duplicate image detection, and investigate how different probabilistic models affect the performance obtained. The task of identifying an image whose metadata are missing is often demanded for a myriad of applications: metadata retrieval in cultural institutions, detection of copyright violations, investigation of latent cross-links in archives and libraries, duplicate elimination in storage management, etc. The majority of current solutions are based either on voting algorithms, which are very precise but expensive, or on the use of visual dictionaries, which are efficient but less precise. Our approach uses local descriptors in a novel way, which, by a careful application of decision theory, allows a very fine control of the compromise between precision and efficiency. In addition, the method attains a great compromise between those two axes, with more than 99% accuracy with fewer than 10 database operations.
Submitted 25 April, 2011;
originally announced April 2011.