Localization of Fake News Detection via Multitask Transfer Learning

Cruz, Jan Christian Blaise; Tan, Julianne Agatha; Cheng, Charibeth

doi:10.13140/RG.2.2.23028.40322

arXiv:1910.09295 (cs)

[Submitted on 21 Oct 2019 (v1), last revised 15 May 2020 (this version, v3)]

Authors: Jan Christian Blaise Cruz, Julianne Agatha Tan, Charibeth Cheng

Abstract: The use of the internet as a fast medium for spreading fake news reinforces the need for computational tools that combat it. Techniques that train fake news classifiers exist, but they all assume an abundance of resources, including large labeled datasets and expert-curated corpora, which low-resource languages may not have. In this work, we make two main contributions. First, we alleviate resource scarcity by constructing the first expertly curated benchmark dataset for fake news detection in Filipino, which we call "Fake News Filipino." Second, we benchmark Transfer Learning (TL) techniques and show that they can be used to train robust fake news classifiers from little data, achieving 91% accuracy on our fake news dataset and reducing the error by 14% compared to established few-shot baselines. Furthermore, lifting ideas from multitask learning, we show that augmenting transformer-based transfer techniques with auxiliary language modeling losses improves their performance by adapting to writing style. Using this, we improve TL performance by 4-6%, achieving an accuracy of 96% on our best model. Lastly, we show that our method generalizes well to different types of news articles, including political news, entertainment news, and opinion articles.
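
The idea of adding an auxiliary language modeling loss to a classification objective can be illustrated with a small, framework-free sketch. This is not the authors' implementation: the function names, the toy logits, and the weighting hyperparameter `lm_weight` are all assumptions made for illustration, and the abstract does not state how the two losses are actually combined.

```python
import math

def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def multitask_loss(clf_logits, label, lm_logits, next_tokens, lm_weight=0.5):
    """Classification loss plus a weighted auxiliary language modeling loss.

    `lm_weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    clf = cross_entropy(clf_logits, label)
    # Average the next-token loss over all positions in the sequence.
    lm = sum(cross_entropy(l, t) for l, t in zip(lm_logits, next_tokens))
    lm /= len(next_tokens)
    return clf + lm_weight * lm

# Toy example: 2 classes (real/fake), a vocabulary of 3 tokens, 2 positions.
loss = multitask_loss(
    clf_logits=[2.0, 0.0], label=0,
    lm_logits=[[1.0, 0.0, 0.0], [0.0, 0.0, 3.0]], next_tokens=[0, 2],
)
print(round(loss, 4))
```

Setting `lm_weight=0.0` in this sketch recovers the plain classification loss, which makes it easy to compare training with and without the auxiliary term.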

Comments:	Published in the LREC 2020 Proceedings. Models and data available at this https URL
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1910.09295 [cs.CL]
	(or arXiv:1910.09295v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1910.09295
Journal reference:	In Proceedings of The 12th Language Resources and Evaluation Conference, pp.2589-2597 (2020)
Related DOI:	https://doi.org/10.13140/RG.2.2.23028.40322

Submission history

From: Jan Christian Blaise Cruz
[v1] Mon, 21 Oct 2019 12:28:00 UTC (155 KB)
[v2] Wed, 30 Oct 2019 04:57:39 UTC (154 KB)
[v3] Fri, 15 May 2020 17:54:09 UTC (254 KB)
