Category Enhanced Word Embedding

Zhou, Chunting; Sun, Chonglin; Liu, Zhiyuan; Lau, Francis C. M.

arXiv:1511.08629 (cs)

[Submitted on 27 Nov 2015 (v1), last revised 30 Nov 2015 (this version, v2)]

Abstract: Distributed word representations have been demonstrated to be effective in capturing semantic and syntactic regularities. Unsupervised representation learning from large unlabeled corpora can learn similar representations for words that present similar co-occurrence statistics. Besides local occurrence statistics, global topical information is also important knowledge that may help discriminate one word from another. In this paper, we incorporate category information of documents into the learning of word representations and learn the proposed models in a document-wise manner. Our models outperform several state-of-the-art models in word analogy and word similarity tasks. Moreover, we evaluate the learned word vectors on sentiment analysis and text classification tasks, which shows the superiority of our learned word vectors. We also learn high-quality category embeddings that reflect topical meanings.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1511.08629 [cs.CL]
	(or arXiv:1511.08629v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1511.08629

Submission history

From: Chunting Zhou
[v1] Fri, 27 Nov 2015 11:38:57 UTC (122 KB)
[v2] Mon, 30 Nov 2015 07:33:09 UTC (122 KB)
