Softmax Is Not an Artificial Trick: An Information-Theoretic View of Softmax in Neural Networks

Qin, Zhenyue; Kim, Dongwoo

arXiv:1910.02629 (cs)

This paper has been withdrawn by Zhenyue Qin

[Submitted on 7 Oct 2019 (v1), last revised 15 Oct 2019 (this version, v3)]

Abstract: Despite the great popularity of applying softmax to map the non-normalised outputs of a neural network to a probability distribution over the predicted classes, this normalised exponential transformation still seems artificial. A theoretical framework that incorporates softmax as an intrinsic component is still lacking. In this paper, we view neural networks that embed softmax from an information-theoretic perspective. Under this view, we can naturally and mathematically derive log-softmax as an inherent component of a neural network for evaluating the conditional mutual information between network output vectors and labels given an input datum. We show that training deterministic neural networks by maximising log-softmax is equivalent to enlarging the conditional mutual information, i.e., feeding label information into network outputs. We also generalise our information-theoretic perspective to neural networks with stochasticity and derive information-theoretic upper and lower bounds on log-softmax. In theory, such an information-theoretic view offers a rationale for embedding softmax in neural networks; in practice, we demonstrate a computer vision application showing how to employ our information-theoretic view to filter out targeted objects in images.
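
For readers unfamiliar with the quantity the abstract centres on, the following is a minimal sketch of log-softmax and of the usual training objective built from it. It shows only the standard definition, log-softmax(z)_k = z_k - log(sum_j exp(z_j)), and is not the authors' derivation or code. The function names, the example scores in `logits`, and the chosen `label` are hypothetical and serve only as illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of real-valued scores."""
    # Subtracting the maximum before exponentiating avoids overflow;
    # the result is mathematically unchanged.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_norm for z in logits]


def softmax(logits):
    """Softmax probabilities, obtained by exponentiating log-softmax."""
    return [math.exp(v) for v in log_softmax(logits)]


# Hypothetical non-normalised network outputs for three classes.
logits = [2.0, 1.0, 0.1]
label = 0  # hypothetical ground-truth class index

log_probs = log_softmax(logits)
probs = softmax(logits)

# Maximising the log-softmax entry of the true label is the same as
# minimising the negative log-likelihood (cross-entropy) below.
nll = -log_probs[label]
```

Raising the score of the labelled class relative to the others increases `log_probs[label]` and lowers `nll`, which is the sense in which training "maximises log-softmax".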

Comments:	Withdrawn due to Zhenyue Qin uploading the manuscript without consent of the other authors
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:1910.02629 [cs.LG]
	(or arXiv:1910.02629v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1910.02629

Submission history

From: Zhenyue Qin
[v1] Mon, 7 Oct 2019 06:46:06 UTC (2,912 KB)
[v2] Mon, 14 Oct 2019 00:34:56 UTC (1 KB) (withdrawn)
[v3] Tue, 15 Oct 2019 05:59:37 UTC (1 KB) (withdrawn)
