Computer Science > Computer Vision and Pattern Recognition
[Submitted on 26 May 2019]
Title:Integration of Text-maps in Convolutional Neural Networks for Region Detection among Different Textual Categories
View PDFAbstract:In this work, we propose a new technique that combines appearance and text in a Convolutional Neural Network (CNN), with the aim of detecting regions of different textual categories. We define a novel visual representation of the semantic meaning of text that allows a seamless integration in a standard CNN architecture. This representation, referred to as text-map, is integrated with the actual image to provide a much richer input to the network. Text-maps are colored with different intensities depending on the relevance of the words recognized over the image. Concretely, these words are previously extracted using Optical Character Recognition (OCR) and they are colored according to the probability of belonging to a textual category of interest. In this sense, this solution is especially relevant in the context of item coding for supermarket products, where different types of textual categories must be identified, such as ingredients or nutritional facts. We evaluated our solution in the proprietary item coding dataset of Nielsen Brandbank, which contains more than 10,000 images for train and 2,000 images for test. The reported results demonstrate that our approach focused on visual and textual data outperforms state-of-the-art algorithms only based on appearance, such as standard Faster R-CNN. These enhancements are reflected in precision and recall, which are improved in 42 and 33 points respectively.
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.