An information theoretic model for summarization, and some basic results

Graves, Eric; Ning, Qiang; Basu, Prithwish

[Submitted on 18 Jan 2019]

Abstract: A basic information theoretic model for summarization is formulated. Here summarization is considered as the process of taking a report of $v$ binary objects and producing from it a $j$-element subset that captures most of the important features of the original report, with importance being defined via an arbitrary set function endemic to the model. The loss of information is then measured by a weighted average of variational distances, which we term the semantic loss.
Our results cover both the case where the probability distribution generating the $v$-length reports is known and the case where it is unknown. In the case where it is known, our results demonstrate how to construct summarizers which minimize the semantic loss. For the case where the probability distribution is unknown, we show how to construct summarizers whose semantic loss, when averaged uniformly over all possible distributions, converges to the minimum.
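
The abstract builds its loss measure out of variational distances. As a minimal illustration only, the sketch below computes the variational (total variation) distance between two distributions over reports of $v$ binary objects. It assumes the common convention $\frac{1}{2}\sum_x |p(x) - q(x)|$; the abstract does not state the paper's normalization, its weighting, or its set function, so none of those are modeled here. The function name `variational_distance` and both example distributions are hypothetical.

```python
from itertools import product


def variational_distance(p, q):
    """Variational (total variation) distance between two discrete distributions.

    Assumption: uses the 1/2 * sum |p(x) - q(x)| convention; the abstract does
    not say which normalization the paper adopts.
    p and q are dicts mapping outcomes to probabilities.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Hypothetical example: reports of v = 2 binary objects.
v = 2
reports = list(product((0, 1), repeat=v))

# Uniform distribution over all 2**v reports.
uniform = {r: 1.0 / len(reports) for r in reports}

# A distribution concentrated entirely on the all-zeros report.
point_mass = {r: (1.0 if r == (0, 0) else 0.0) for r in reports}

distance = variational_distance(uniform, point_mass)
```

Representing distributions as dictionaries keyed by report tuples keeps the sketch readable for small $v$; it is not meant to scale, since the number of reports grows as $2^v$.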

Comments:	9 pages, 2 figures. Extended version of ISIT submission
Subjects:	Information Theory (cs.IT)
Cite as:	arXiv:1901.06376 [cs.IT]
	(or arXiv:1901.06376v1 [cs.IT] for this version)
	https://doi.org/10.48550/arXiv.1901.06376

Submission history

From: Eric Graves
[v1] Fri, 18 Jan 2019 18:38:40 UTC (26 KB)
