-
Multiple Imputation of Hierarchical Nonlinear Time Series Data with an Application to School Enrollment Data
Authors:
Daphne H. Liu,
Adrian E. Raftery
Abstract:
International comparisons of hierarchical time series data sets based on survey data, such as annual country-level estimates of school enrollment rates, can suffer from large amounts of missing data due to differing coverage of surveys across countries and across times. A popular approach to handling missing data in these settings is multiple imputation, which can be especially effective when there is an auxiliary variable that is strongly predictive of the variable of interest and has less missing data. However, standard methods for multiple imputation of hierarchical time series data can perform poorly when the auxiliary variable and the variable of interest have a nonlinear relationship. Performance of standard multiple imputation methods can also suffer if the substantive analysis model of interest is uncongenial to the imputation model, which can be a common occurrence for social science data if the imputation phase is conducted independently of the analysis phase. We propose a Bayesian method for multiple imputation of hierarchical nonlinear time series data that uses a sequential decomposition of the joint distribution and incorporates smoothing splines to account for nonlinear relationships between variables. We compare the proposed method with existing multiple imputation methods through a simulation study and an application to secondary school enrollment data. We find that the proposed method can lead to substantial performance increases for estimation of parameters in uncongenial analysis models and for prediction of individual missing values.
Submitted 3 January, 2024;
originally announced January 2024.
-
Pyronia: Intra-Process Access Control for IoT Applications
Authors:
Marcela S. Melara,
David H. Liu,
Michael J. Freedman
Abstract:
Third-party code plays a critical role in IoT applications, which generate and analyze highly privacy-sensitive data. Unlike traditional desktop and server settings, IoT devices mostly run a dedicated, single application. As a result, vulnerabilities in third-party libraries within a process pose a much bigger threat than on traditional platforms.
We present Pyronia, a fine-grained access control system for IoT applications written in high-level languages. Pyronia exploits developers' coarse-grained expectations about how imported third-party code operates to restrict access to files, devices, and specific network destinations, at the granularity of individual functions. To efficiently protect such sensitive OS resources, Pyronia combines three techniques: system call interposition, stack inspection, and memory domains. This design avoids the need for application refactoring, or unintuitive data flow analysis, while enforcing the developer's access policy at run time. Our Pyronia prototype for Python runs on a custom Linux kernel, and incurs moderate performance overhead on unmodified Python applications.
Submitted 20 November, 2019; v1 submitted 5 March, 2019;
originally announced March 2019.
-
Distributed Data-Processing Pipeline for Mingantu Ultrawide Spectral Radioheliograph
Authors:
F. Wang,
Y. Mei,
H. Deng,
C. Y. Liu,
D. H. Liu,
S. L. Wei,
W. Dai,
B. Liang,
Y. B. Liu,
X. L. Zhang,
K. F. Ji
Abstract:
The Chinese Spectral RadioHeliograph (CSRH) is a synthetic aperture radio interferometer built in Inner Mongolia, China. As a solar-dedicated interferometric array, CSRH is capable of producing high-quality radio images in the frequency range from 400 MHz to 15 GHz with high temporal, spatial, and spectral resolution. Implementing the data processing system for CSRH is a great challenge, because it must support high-cadence imaging over a wide band and at more than two orders of magnitude more frequencies. A pipeline for processing the massive volume of data that CSRH generates every day is urgently needed. In this paper, we develop a high-performance distributed data processing pipeline (DDPP) built on the OpenCluster infrastructure for processing CSRH observational data, including data storage, archiving, preprocessing, image reconstruction, deconvolution, and real-time monitoring. We describe in detail the system architecture of the pipeline and the implementation of each subsystem. The DDPP is automatic, robust, scalable, and manageable. Its processing performance on a hybrid system combining multi-computer parallelism and GPUs meets the requirements of CSRH data processing. The study provides a valuable reference for other radio telescopes, especially aperture synthesis telescopes, and also makes a valuable contribution to current and future data-intensive astronomical observations.
Submitted 20 December, 2016;
originally announced December 2016.