MCMLSD: A Probabilistic Algorithm and Evaluation Framework for Line Segment Detection
Authors:
James H. Elder,
Emilio J. Almazàn,
Yiming Qian,
Ron Tal
Abstract:
Traditional approaches to line segment detection typically involve perceptual grou** in the image domain and/or global accumulation in the Hough domain. Here we propose a probabilistic algorithm that merges the advantages of both approaches. In a first stage lines are detected using a global probabilistic Hough approach. In the second stage each detected line is analyzed in the image domain to l…
▽ More
Traditional approaches to line segment detection typically involve perceptual grou** in the image domain and/or global accumulation in the Hough domain. Here we propose a probabilistic algorithm that merges the advantages of both approaches. In a first stage lines are detected using a global probabilistic Hough approach. In the second stage each detected line is analyzed in the image domain to localize the line segments that generated the peak in the Hough map. By limiting search to a line, the distribution of segments over the sequence of points on the line can be modeled as a Markov chain, and a probabilistically optimal labelling can be computed exactly using a standard dynamic programming algorithm, in linear time. The Markov assumption also leads to an intuitive ranking method that uses the local marginal posterior probabilities to estimate the expected number of correctly labelled points on a segment. To assess the resulting Markov Chain Marginal Line Segment Detector (MCMLSD) we develop and apply a novel quantitative evaluation methodology that controls for under- and over-segmentation. Evaluation on the YorkUrbanDB and Wireframe datasets shows that the proposed MCMLSD method outperforms prior traditional approaches, as well as more recent deep learning methods.
△ Less
Submitted 6 January, 2020;
originally announced January 2020.
Integration of Text-maps in Convolutional Neural Networks for Region Detection among Different Textual Categories
Authors:
Roberto Arroyo,
Javier Tovar,
Francisco J. Delgado,
Emilio J. Almazán,
Diego G. Serrador,
Antonio Hurtado
Abstract:
In this work, we propose a new technique that combines appearance and text in a Convolutional Neural Network (CNN), with the aim of detecting regions of different textual categories. We define a novel visual representation of the semantic meaning of text that allows a seamless integration in a standard CNN architecture. This representation, referred to as text-map, is integrated with the actual im…
▽ More
In this work, we propose a new technique that combines appearance and text in a Convolutional Neural Network (CNN), with the aim of detecting regions of different textual categories. We define a novel visual representation of the semantic meaning of text that allows a seamless integration in a standard CNN architecture. This representation, referred to as text-map, is integrated with the actual image to provide a much richer input to the network. Text-maps are colored with different intensities depending on the relevance of the words recognized over the image. Concretely, these words are previously extracted using Optical Character Recognition (OCR) and they are colored according to the probability of belonging to a textual category of interest. In this sense, this solution is especially relevant in the context of item coding for supermarket products, where different types of textual categories must be identified, such as ingredients or nutritional facts. We evaluated our solution in the proprietary item coding dataset of Nielsen Brandbank, which contains more than 10,000 images for train and 2,000 images for test. The reported results demonstrate that our approach focused on visual and textual data outperforms state-of-the-art algorithms only based on appearance, such as standard Faster R-CNN. These enhancements are reflected in precision and recall, which are improved in 42 and 33 points respectively.
△ Less
Submitted 26 May, 2019;
originally announced May 2019.