Showing 1–2 of 2 results for author: Dankar, F K

Search v0.5.6 released 2020-02-24

arXiv:2304.06509 [pdf]

cs.CY

Practices and challenges in clinical data sharing

Authors: Fida K. Dankar

Abstract: The debate on data access and privacy is an ongoing one. It is kept alive by the never-ending changes/upgrades in (i) the shape of the data collected (in terms of size, diversity, sensitivity and quality), (ii) the laws governing data sharing, (iii) the amount of free public data available on individuals (social media, blogs, population-based databases, etc.), as well as (iv) the available privacy-enhancing technologies. This paper identifies current directions, challenges and best practices in constructing a clinical data-sharing framework for research purposes. Specifically, we create a taxonomy for the framework, identify the design choices available within each taxon, and demonstrate these choices using current legal frameworks. The purpose is to devise best practices for the implementation of an effective, safe and transparent research access framework.

Submitted 17 March, 2023; originally announced April 2023.

MSC Class: 68; ACM Class: J.3

arXiv:2212.05595 [pdf]

cs.DB

A new PCA-based utility measure for synthetic data evaluation

Authors: F. K. Dankar, M. K. Ibrahim

Abstract: Data synthesis is a privacy-enhancing technology aiming to produce realistic and timely data when real data is hard to obtain. The utility of synthetic data generators (SDGs) has been investigated through different utility metrics. These metrics have been found to generate conflicting conclusions, making direct comparison of SDGs surprisingly difficult. Moreover, prior research found no correlation between popular metrics, concluding that they tackle different utility dimensions. This paper aggregates four popular utility metrics (representing different utility dimensions) into one using principal component analysis and checks whether the new measure can generate synthetic data that perform well in real life. The new measure is used to compare four well-recognized SDGs.

Submitted 26 November, 2022; originally announced December 2022.

Comments: 20 pages, 5 figures, 8 tables, 1 appendix

MSC Class: 68Txx; ACM Class: I.0
