Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator

Yamada, Makoto; Wu, Denny; Tsai, Yao-Hung Hubert; Takeuchi, Ichiro; Salakhutdinov, Ruslan; Fukumizu, Kenji

arXiv:1802.06226 (stat)

[Submitted on 17 Feb 2018]

Abstract: Measuring the divergence between two distributions is essential in machine learning and statistics, with applications including binary classification, change point detection, and two-sample testing. Furthermore, in the era of big data, designing a divergence measure that is interpretable and can handle high-dimensional, complex data becomes extremely important. In this paper, we propose a post selection inference (PSI) framework for divergence measures, which can select a set of statistically significant features that discriminate between two distributions. Specifically, we employ an additive variant of the maximum mean discrepancy (MMD) over features and introduce a general hypothesis test for PSI. We also propose and theoretically analyze a novel MMD estimator based on incomplete U-statistics, which has an asymptotically normal distribution (under mild assumptions) and gives high detection power in PSI. Through synthetic and real-world feature selection experiments, we show that the proposed framework can successfully detect statistically significant features. Finally, we propose a sample selection framework for analyzing different members of the Generative Adversarial Networks (GANs) family.
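As a rough illustration of the idea in the abstract, the sketch below estimates a per-feature squared MMD by averaging the usual U-statistic kernel over a random subset of index pairs rather than all pairs. It is a minimal sketch and not the authors' implementation: the Gaussian kernel, the fixed bandwidth, sampling pairs with replacement, the requirement that both samples have the same size, and the names `gaussian_kernel`, `incomplete_mmd`, and `per_feature_mmd` are all assumptions made for illustration. The selection step and the PSI hypothesis test are not shown.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel between paired rows of a and b (assumed kernel choice)."""
    sq_dist = np.sum((a - b) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def incomplete_mmd(x, y, n_pairs, rng, bandwidth=1.0):
    """Squared-MMD estimate averaged over n_pairs random index pairs (i, j), i != j.

    Assumes x and y hold the same number of samples. Pairs are drawn with
    replacement, which is one possible incomplete design among several.
    """
    n = x.shape[0]
    i = rng.integers(0, n, size=n_pairs)
    # Adding an offset in [1, n-1] modulo n guarantees j != i.
    j = (i + rng.integers(1, n, size=n_pairs)) % n
    h = (
        gaussian_kernel(x[i], x[j], bandwidth)
        + gaussian_kernel(y[i], y[j], bandwidth)
        - gaussian_kernel(x[i], y[j], bandwidth)
        - gaussian_kernel(x[j], y[i], bandwidth)
    )
    return float(h.mean())


def per_feature_mmd(x, y, n_pairs, rng, bandwidth=1.0):
    """Apply the estimator to each feature (column) separately."""
    return np.array([
        incomplete_mmd(x[:, [d]], y[:, [d]], n_pairs, rng, bandwidth)
        for d in range(x.shape[1])
    ])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(500, 5))
    y = rng.normal(size=(500, 5))
    y[:, 0] += 2.0  # only the first feature differs between the two samples
    print(per_feature_mmd(x, y, n_pairs=5000, rng=rng))
```

In this toy setup only the first feature is shifted, so its score should stand out from the others. Using a subset of pairs trades some variance for a lower cost than the full quadratic sum over all pairs.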

Subjects:	Machine Learning (stat.ML)
Cite as:	arXiv:1802.06226 [stat.ML]
	(or arXiv:1802.06226v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1802.06226

Submission history

From: Makoto Yamada
[v1] Sat, 17 Feb 2018 11:48:02 UTC (630 KB)
