Quda: Natural Language Queries for Visual Data Analytics

Fu, Siwei; Xiong, Kai; Ge, Xiaodong; Tang, Siliang; Chen, Wei; Wu, Yingcai

arXiv:2005.03257 (cs)

[Submitted on 7 May 2020 (v1), last revised 3 Dec 2020 (this version, v5)]

Abstract: The identification of analytic tasks from free text is critical for visualization-oriented natural language interfaces (V-NLIs) to suggest effective visualizations. However, it is challenging due to the ambiguous and complex nature of human language. To address this challenge, we present a new dataset, called Quda, that aims to help V-NLIs recognize analytic tasks from free-form natural language by training and evaluating cutting-edge multi-label classification models. Our dataset contains 14,035 diverse user queries, each annotated with one or more analytic tasks. We built it by first gathering seed queries from data analysts and then employing a large crowd workforce for paraphrase generation and validation. We demonstrate the usefulness of Quda through three applications. This work is the first attempt to construct a large-scale corpus for recognizing analytic tasks. We hope the release of Quda will boost the research and development of V-NLIs in data analysis and visualization.

Subjects:	Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)
Cite as:	arXiv:2005.03257 [cs.CL]
	(or arXiv:2005.03257v5 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2005.03257

Submission history

From: Xiaodong Ge
[v1] Thu, 7 May 2020 05:35:16 UTC (1,753 KB)
[v2] Wed, 13 May 2020 16:00:51 UTC (1,753 KB)
[v3] Thu, 20 Aug 2020 12:45:39 UTC (1 KB) (withdrawn)
[v4] Sun, 23 Aug 2020 07:34:50 UTC (1 KB) (withdrawn)
[v5] Thu, 3 Dec 2020 06:58:56 UTC (3,005 KB)
