-
A systematic comparison of measures for k-anonymity in networks
Authors:
Rachel G. de Jong,
Mark P. J. van der Loo,
Frank W. Takes
Abstract:
Privacy-aware sharing of network data is a difficult task due to the interconnectedness of individuals in networks. An important part of this problem is the inherently difficult question of how in a particular situation the privacy of an individual node should be measured. To that end, in this paper we propose a set of aspects that one should consider when choosing a measure for privacy. These asp…
▽ More
Privacy-aware sharing of network data is a difficult task due to the interconnectedness of individuals in networks. An important part of this problem is the inherently difficult question of how in a particular situation the privacy of an individual node should be measured. To that end, in this paper we propose a set of aspects that one should consider when choosing a measure for privacy. These aspects include the type of desired privacy and attacker scenario against which the measure protects, utility of the data, the type of desired output, and the computational complexity of the chosen measure. Based on these aspects, we provide a systematic overview of existing approaches in the literature. We then focus on a set of measures that ultimately enables our objective: sharing the anonymized full network dataset with limited disclosure risk. The considered measures, each based on the concept of k-anonymity, account for the structure of the surroundings of a certain node and differ in completeness and reach of the structural information taken into account. We present a comprehensive theoretical characterization as well as comparative empirical experiments on a wide range of real-world network datasets with up to millions of edges. We find that the choice of the measure has an enormous effect on aforementioned aspects. Most interestingly, we find that the most effective measures consider a greater node vicinity, yet utilize minimal structural information and thus use minimal computational resources. This finding has important implications for researchers and practitioners, who may, based on the recommendations given in this paper, make an informed choice on how to safely share large-scale network data in a privacy-aware manner.
△ Less
Submitted 2 July, 2024;
originally announced July 2024.
-
Reproducibility study of FairAC
Authors:
Gijs de Jong,
Macha J. Meijer,
Derck W. E. Prinzhorn,
Harold Ruiter
Abstract:
This work aims to reproduce the findings of the paper "Fair Attribute Completion on Graph with Missing Attributes" written by Guo, Chu, and Li arXiv:2302.12977 by investigating the claims made in the paper. This paper suggests that the results of the original paper are reproducible and thus, the claims hold. However, the claim that FairAC is a generic framework for many downstream tasks is very br…
▽ More
This work aims to reproduce the findings of the paper "Fair Attribute Completion on Graph with Missing Attributes" written by Guo, Chu, and Li arXiv:2302.12977 by investigating the claims made in the paper. This paper suggests that the results of the original paper are reproducible and thus, the claims hold. However, the claim that FairAC is a generic framework for many downstream tasks is very broad and could therefore only be partially tested. Moreover, we show that FairAC is generalizable to various datasets and sensitive attributes and show evidence that the improvement in group fairness of the FairAC framework does not come at the expense of individual fairness. Lastly, the codebase of FairAC has been refactored and is now easily applicable for various datasets and models.
△ Less
Submitted 10 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Virtual reservoir acceleration for CPU and GPU: Case study for coupled spin-torque oscillator reservoir
Authors:
Thomas Geert de Jong,
Nozomi Akashi,
Tomohiro Taniguchi,
Hirofumi Notsu,
Kohei Nakajima
Abstract:
We provide high-speed implementations for simulating reservoirs described by $N$-coupled spin-torque oscillators. Here $N$ also corresponds to the number of reservoir nodes. We benchmark a variety of implementations based on CPU and GPU. Our new methods are at least 2.6 times quicker than the baseline for $N$ in range $1$ to $10^4$. More specifically, over all implementations the best factor is 78…
▽ More
We provide high-speed implementations for simulating reservoirs described by $N$-coupled spin-torque oscillators. Here $N$ also corresponds to the number of reservoir nodes. We benchmark a variety of implementations based on CPU and GPU. Our new methods are at least 2.6 times quicker than the baseline for $N$ in range $1$ to $10^4$. More specifically, over all implementations the best factor is 78.9 for $N=1$ which decreases to 2.6 for $N=10^3$ and finally increases to 23.8 for $N=10^4$. GPU outperforms CPU significantly at $N=2500$. Our results show that GPU implementations should be tested for reservoir simulations. The implementations considered here can be used for any reservoir with evolution that can be approximated using an explicit method.
△ Less
Submitted 2 December, 2023;
originally announced December 2023.
-
Fully automated landmarking and facial segmentation on 3D photographs
Authors:
Bo Berends,
Freek Bielevelt,
Ruud Schreurs,
Shankeeth Vinayahalingam,
Thomas Maal,
Guido de Jong
Abstract:
Three-dimensional facial stereophotogrammetry provides a detailed representation of craniofacial soft tissue without the use of ionizing radiation. While manual annotation of landmarks serves as the current gold standard for cephalometric analysis, it is a time-consuming process and is prone to human error. The aim in this study was to develop and evaluate an automated cephalometric annotation met…
▽ More
Three-dimensional facial stereophotogrammetry provides a detailed representation of craniofacial soft tissue without the use of ionizing radiation. While manual annotation of landmarks serves as the current gold standard for cephalometric analysis, it is a time-consuming process and is prone to human error. The aim in this study was to develop and evaluate an automated cephalometric annotation method using a deep learning-based approach. Ten landmarks were manually annotated on 2897 3D facial photographs by a single observer. The automated landmarking workflow involved two successive DiffusionNet models and additional algorithms for facial segmentation. The dataset was randomly divided into a training and test dataset. The training dataset was used to train the deep learning networks, whereas the test dataset was used to evaluate the performance of the automated workflow. The precision of the workflow was evaluated by calculating the Euclidean distances between the automated and manual landmarks and compared to the intra-observer and inter-observer variability of manual annotation and the semi-automated landmarking method. The workflow was successful in 98.6% of all test cases. The deep learning-based landmarking method achieved precise and consistent landmark annotation. The mean precision of 1.69 (+/-1.15) mm was comparable to the inter-observer variability (1.31 +/-0.91 mm) of manual annotation. The Euclidean distance between the automated and manual landmarks was within 2 mm in 69%. Automated landmark annotation on 3D photographs was achieved with the DiffusionNet-based approach. The proposed method allows quantitative analysis of large datasets and may be used in diagnosis, follow-up, and virtual surgical planning.
△ Less
Submitted 19 September, 2023;
originally announced September 2023.
-
The effect of distant connections on node anonymity in complex networks
Authors:
Rachel G. de Jong,
Mark P. J. van der Loo,
Frank W. Takes
Abstract:
Ensuring privacy of individuals is of paramount importance to social network analysis research. Previous work assessed anonymity in a network based on the non-uniqueness of a node's ego network. In this work, we show that this approach does not adequately account for the strong de-anonymizing effect of distant connections. We first propose the use of d-k-anonymity, a novel measure that takes knowl…
▽ More
Ensuring privacy of individuals is of paramount importance to social network analysis research. Previous work assessed anonymity in a network based on the non-uniqueness of a node's ego network. In this work, we show that this approach does not adequately account for the strong de-anonymizing effect of distant connections. We first propose the use of d-k-anonymity, a novel measure that takes knowledge up to distance d of a considered node into account. Second, we introduce anonymity-cascade, which exploits the so-called infectiousness of uniqueness: mere information about being connected to another unique node can make a given node uniquely identifiable. These two approaches, together with relevant "twin node" processing steps in the underlying graph structure, offer practitioners flexible solutions, tunable in precision and computation time. This enables the assessment of anonymity in large-scale networks with up to millions of nodes and edges. Experiments on graph models and a wide range of real-world networks show drastic decreases in anonymity when connections at distance 2 are considered. Moreover, extending the knowledge beyond the ego network with just one extra link often already decreases overall anonymity by over 50%. These findings have important implications for privacy-aware sharing of sensitive network data.
△ Less
Submitted 14 November, 2023; v1 submitted 23 June, 2023;
originally announced June 2023.
-
How neural networks learn to classify chaotic time series
Authors:
Alessandro Corbetta,
Thomas Geert de Jong
Abstract:
Neural networks are increasingly employed to model, analyze and control non-linear dynamical systems ranging from physics to biology. Owing to their universal approximation capabilities, they regularly outperform state-of-the-art model-driven methods in terms of accuracy, computational speed, and/or control capabilities. On the other hand, neural networks are very often they are taken as black box…
▽ More
Neural networks are increasingly employed to model, analyze and control non-linear dynamical systems ranging from physics to biology. Owing to their universal approximation capabilities, they regularly outperform state-of-the-art model-driven methods in terms of accuracy, computational speed, and/or control capabilities. On the other hand, neural networks are very often they are taken as black boxes whose explainability is challenged, among others, by huge amounts of trainable parameters. In this paper, we tackle the outstanding issue of analyzing the inner workings of neural networks trained to classify regular-versus-chaotic time series. This setting, well-studied in dynamical systems, enables thorough formal analyses. We focus specifically on a family of networks dubbed Large Kernel Convolutional Neural Networks (LKCNN), recently introduced by Boullé et al. (2021). These non-recursive networks have been shown to outperform other established architectures (e.g. residual networks, shallow neural networks and fully convolutional networks) at this classification task. Furthermore, they outperform ``manual'' classification approaches based on direct reconstruction of the Lyapunov exponent. We find that LKCNNs use qualitative properties of the input sequence. In particular, we show that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models. Low performing models show, in fact, analogous periodic activations to random untrained models. This could give very general criteria for identifying, a priori, trained models that have poor accuracy.
△ Less
Submitted 4 June, 2023;
originally announced June 2023.
-
On Constructing the Value Function for Optimal Trajectory Problem and its Application to Image Processing
Authors:
Myong-Song Ho,
Gwang-Hui Ju,
Yong-Bom O,
Gwang-Ho Jong
Abstract:
We proposed an algorithm for solving Hamilton-Jacobi equation associated to an optimal trajectory problem for a vehicle moving inside the pre-specified domain with the speed depending upon the direction of the motion and current position of the vehicle. The dynamics of the vehicle is defined by an ordinary differential equation, the right hand of which is given by product of control(a time depende…
▽ More
We proposed an algorithm for solving Hamilton-Jacobi equation associated to an optimal trajectory problem for a vehicle moving inside the pre-specified domain with the speed depending upon the direction of the motion and current position of the vehicle. The dynamics of the vehicle is defined by an ordinary differential equation, the right hand of which is given by product of control(a time dependent fuction) and a function dependent on trajectory and control. At some unspecified terminal time, the vehicle reaches the boundary of the pre-specified domain and incurs a terminal cost. We also associate the traveling cost with a type of integral to the trajectory followed by vehicle. We are interested in a numerical method for finding a trajectory that minimizes the sum of the traveling cost and terminal cost. We developed an algorithm solving the value function for general trajectory optimization problem. Our algorithm is closely related to the Tsitsiklis's Fast Marching Method and J. A. Sethian's OUM and SLF-LLL[1-4] and is a generalization of them. On the basis of these results, We applied our algorithm to the image processing such as fingerprint verification.
△ Less
Submitted 27 March, 2013; v1 submitted 20 March, 2013;
originally announced March 2013.