TY - JOUR
T1 - DUEF-GA
T2 - data utility and privacy evaluation framework for graph anonymization
AU - Casas-Roma, Jordi
N1 - Publisher Copyright:
© 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2020/8/1
Y1 - 2020/8/1
N2 - Anonymization of graph-based data is a problem which has been widely studied over the last years, and several anonymization methods have been developed. Information loss measures have been used to evaluate data utility and information loss in the anonymized graphs. However, there is no consensus about how to evaluate data utility and information loss in privacy-preserving and anonymization scenarios, where the anonymous datasets were perturbed to hinder re-identification processes. Authors use diverse metrics to evaluate data utility and, consequently, it is complex to compare different methods or algorithms in the literature. In this paper, we propose a framework to evaluate and compare anonymous datasets in a common way, providing an objective score to clearly compare methods and algorithms. Our framework includes metrics based on generic information loss measures, such as average distance or betweenness centrality and also task-specific information loss measures, such as community detection or information flow. Additionally, we provide some metrics to examine re-identification and risk assessment. We demonstrate that our framework could help researchers and practitioners to select the best parametrization and/or algorithm to reduce information loss and maximize data utility.
AB - Anonymization of graph-based data is a problem which has been widely studied over the last years, and several anonymization methods have been developed. Information loss measures have been used to evaluate data utility and information loss in the anonymized graphs. However, there is no consensus about how to evaluate data utility and information loss in privacy-preserving and anonymization scenarios, where the anonymous datasets were perturbed to hinder re-identification processes. Authors use diverse metrics to evaluate data utility and, consequently, it is complex to compare different methods or algorithms in the literature. In this paper, we propose a framework to evaluate and compare anonymous datasets in a common way, providing an objective score to clearly compare methods and algorithms. Our framework includes metrics based on generic information loss measures, such as average distance or betweenness centrality and also task-specific information loss measures, such as community detection or information flow. Additionally, we provide some metrics to examine re-identification and risk assessment. We demonstrate that our framework could help researchers and practitioners to select the best parametrization and/or algorithm to reduce information loss and maximize data utility.
KW - Anonymity
KW - Data utility
KW - Evaluation framework
KW - Graphs
KW - Privacy-preserving
KW - Social networks
UR - http://www.scopus.com/inward/record.url?scp=85073940894&partnerID=8YFLogxK
U2 - 10.1007/s10207-019-00469-4
DO - 10.1007/s10207-019-00469-4
M3 - Article
AN - SCOPUS:85073940894
SN - 1615-5262
VL - 19
SP - 465
EP - 478
JO - International Journal of Information Security
JF - International Journal of Information Security
IS - 4
ER -