Showing 1–2 of 2 results for author: Standley, T

Search v0.5.6 released 2020-02-24

arXiv:2305.14352 [pdf, other]

cs.CV cs.LG

An Extensible Multimodal Multi-task Object Dataset with Materials

Authors: Trevor Standley, Ruohan Gao, Dawn Chen, Jiajun Wu, Silvio Savarese

Abstract: We present EMMa, an Extensible, Multimodal dataset of Amazon product listings that contains rich Material annotations. It contains more than 2.8 million objects, each with image(s), listing text, mass, price, product ratings, and position in Amazon's product-category taxonomy. We also design a comprehensive taxonomy of 182 physical materials (e.g., Plastic $\rightarrow$ Thermoplastic… ▽ More We present EMMa, an Extensible, Multimodal dataset of Amazon product listings that contains rich Material annotations. It contains more than 2.8 million objects, each with image(s), listing text, mass, price, product ratings, and position in Amazon's product-category taxonomy. We also design a comprehensive taxonomy of 182 physical materials (e.g., Plastic $\rightarrow$ Thermoplastic $\rightarrow$ Acrylic). Objects are annotated with one or more materials from this taxonomy. With the numerous attributes available for each object, we develop a Smart Labeling framework to quickly add new binary labels to all objects with very little manual labeling effort, making the dataset extensible. Each object attribute in our dataset can be included in either the model inputs or outputs, leading to combinatorial possibilities in task configurations. For example, we can train a model to predict the object category from the listing text, or the mass and price from the product listing image. EMMa offers a new benchmark for multi-task learning in computer vision and NLP, and allows practitioners to efficiently add new tasks and object attributes at scale. △ Less

Submitted 29 April, 2023; originally announced May 2023.

Comments: ICLR 2023
arXiv:1905.07553 [pdf, other]

cs.CV

Which Tasks Should Be Learned Together in Multi-task Learning?

Authors: Trevor Standley, Amir R. Zamir, Dawn Chen, Leonidas Guibas, Jitendra Malik, Silvio Savarese

Abstract: Many computer vision applications require solving multiple tasks in real-time. A neural network can be trained to solve multiple tasks simultaneously using multi-task learning. This can save computation at inference time as only a single network needs to be evaluated. Unfortunately, this often leads to inferior overall performance as task objectives can compete, which consequently poses the questi… ▽ More Many computer vision applications require solving multiple tasks in real-time. A neural network can be trained to solve multiple tasks simultaneously using multi-task learning. This can save computation at inference time as only a single network needs to be evaluated. Unfortunately, this often leads to inferior overall performance as task objectives can compete, which consequently poses the question: which tasks should and should not be learned together in one network when employing multi-task learning? We study task cooperation and competition in several different learning settings and propose a framework for assigning tasks to a few neural networks such that cooperating tasks are computed by the same neural network, while competing tasks are computed by different networks. Our framework offers a time-accuracy trade-off and can produce better accuracy using less inference time than not only a single large multi-task neural network but also many single-task networks. △ Less

Submitted 2 September, 2020; v1 submitted 18 May, 2019; originally announced May 2019.

Comments: Presented to ICML 2020 See project website at http://taskgrou**.stanford.edu/

Search v0.5.6 released 2020-02-24