Domain Adaptation under Missingness Shift

Zhou, Helen; Balakrishnan, Sivaraman; Lipton, Zachary C.

arXiv:2211.02093 (cs)

[Submitted on 3 Nov 2022 (v1), last revised 3 May 2023 (this version, v3)]

Abstract: Rates of missing data often depend on record-keeping policies and thus may change across times and locations, even when the underlying features are comparatively stable. In this paper, we introduce the problem of Domain Adaptation under Missingness Shift (DAMS). Here, (labeled) source data and (unlabeled) target data would be exchangeable but for different missing data mechanisms. We show that if missing data indicators are available, DAMS reduces to covariate shift. Addressing cases where such indicators are absent, we establish the following theoretical results for underreporting completely at random: (i) covariate shift is violated (adaptation is required); (ii) the optimal linear source predictor can perform arbitrarily worse on the target domain than always predicting the mean; (iii) the optimal target predictor can be identified, even when the missingness rates themselves are not; and (iv) for linear models, a simple analytic adjustment yields consistent estimates of the optimal target parameters. In experiments on synthetic and semi-synthetic data, we demonstrate the promise of our methods when assumptions hold. Finally, we discuss a rich family of future extensions.
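
Result (ii) can be illustrated with a small simulation. The sketch below is not the authors' construction or code; it is a hypothetical example that assumes missing entries are recorded as 0 with no indicator, and that each entry goes missing independently of the data. The function names (`simulate`, `fit_ols`, `mse`, `run_demo`), the coefficients, and the missingness rates are all illustrative assumptions. Two nearly identical features carry large opposite-signed weights, so a linear model fit on clean source data predicts poorly once the second feature is mostly zeroed out in the target.

```python
import numpy as np


def simulate(n, missing_rates, rng):
    """Draw (x, y), then zero out entries completely at random (assumed mechanism)."""
    x1 = rng.standard_normal(n)
    x2 = x1 + 0.1 * rng.standard_normal(n)  # nearly a copy of x1
    x = np.column_stack([x1, x2])
    y = 5.0 * x1 - 5.0 * x2 + 0.1 * rng.standard_normal(n)
    # Each entry is kept with probability 1 - rate, independently of x and y.
    observed = rng.random(x.shape) >= np.asarray(missing_rates)
    # Missing entries are stored as 0 and no indicator is returned.
    return np.where(observed, x, 0.0), y


def fit_ols(x, y):
    """Least-squares fit with an intercept; returns [intercept, w1, w2]."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def mse(coef, x, y):
    """Mean squared error of a linear predictor with intercept."""
    design = np.column_stack([np.ones(len(x)), x])
    return float(np.mean((design @ coef - y) ** 2))


def run_demo(seed=0, n=20_000):
    rng = np.random.default_rng(seed)
    x_src, y_src = simulate(n, [0.0, 0.0], rng)  # source: nothing missing
    x_tgt, y_tgt = simulate(n, [0.0, 0.9], rng)  # target: x2 missing 90% of the time

    source_coef = fit_ols(x_src, y_src)
    # Oracle for comparison only: it uses target labels, which DAMS does not provide.
    oracle_coef = fit_ols(x_tgt, y_tgt)

    return {
        "source_on_source": mse(source_coef, x_src, y_src),
        "source_on_target": mse(source_coef, x_tgt, y_tgt),
        # Constant prediction equal to the mean of the source labels.
        "mean_baseline": float(np.mean((y_tgt - y_src.mean()) ** 2)),
        # In-sample error of the best linear fit on the target sample.
        "target_oracle": mse(oracle_coef, x_tgt, y_tgt),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.4f}")
```

In this setup the source predictor's target error should come out far above the constant-mean baseline, while the labeled-target oracle does no worse than that baseline. Increasing the two opposite-signed weights widens the gap, which is one way to read "arbitrarily worse" in this toy setting.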

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2211.02093 [cs.LG]
	(or arXiv:2211.02093v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2211.02093

Submission history

From: Helen Zhou
[v1] Thu, 3 Nov 2022 18:49:38 UTC (1,292 KB)
[v2] Wed, 1 Mar 2023 16:31:04 UTC (374 KB)
[v3] Wed, 3 May 2023 20:38:36 UTC (374 KB)
