Comparative Study of Graph-Based Privacy Preservation Techniques

Mariam RAMDI; Oumaima LOUZAR; Ouafae BAIDA; Abdelouahid LYHYAOUI

doi:https://doi.org/10.14445/22315381/IJETT-V72I6P134

Research Article | Open Access | Download PDF

Volume 72 | Issue 6 | Year 2024 | Article Id. IJETT-V72I6P134 | DOI : https://doi.org/10.14445/22315381/IJETT-V72I6P134

Comparative Study of Graph-Based Privacy Preservation Techniques

Mariam RAMDI, Oumaima LOUZAR, Ouafae BAIDA, Abdelouahid LYHYAOUI

Received	Revised	Accepted	Published
19 Feb 2024	14 May 2024	29 May 2024	29 Jun 2024

Citation :

Mariam RAMDI, Oumaima LOUZAR, Ouafae BAIDA, Abdelouahid LYHYAOUI, "Comparative Study of Graph-Based Privacy Preservation Techniques," International Journal of Engineering Trends and Technology (IJETT), vol. 72, no. 6, pp. 380-396, 2024. Crossref, https://doi.org/10.14445/22315381/IJETT-V72I6P134

Abstract

Social networks have emerged as subjects of investigation across numerous fields, including sociology, epidemiology, and viral marketing. Analyzing certain structural properties of a social graph, such as node degree or graph diameter, allows for the inference of information about the individuals comprising the network. An effective approach to anonymize a graph involves generalizing specific groups of nodes into supernodes and collapsing multiple links into meta-links. However, it is important to note that this anonymization method may significantly impact the resulting utility derived from the generalized graph. Various research efforts have proposed techniques to anonymize social networks, but the central challenge in this domain lies in achieving a useful final graph with minimal information loss that can be tailored to meet diverse requirements. This article presents a detailed comparative study that elucidates the strengths and weaknesses of different existing techniques found in the literature.

Keywords

Anonymization, Social networks, Graph modification, Differential privacy, Generalization.

References

[1] Benjamin C.M. Fung et al., Introduction to Privacy-Preserving Data Publishing, 1st ed., Chapman and Hall/CRC, pp. 1-376, 2010.
[CrossRef] [Google Scholar] [Publisher Link]
[2] Latanya Sweeney, “K-Anonymity: A Model For Protecting Privacy,” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 5, pp. 557-570, 2002.
[CrossRef] [Google Scholar] [Publisher Link]
[3] M. Prakash, and G. Singaravel, “A New Model for Privacy Preserving Sensitive Data Mining,” 2012 Third International Conference on Computing, Communication and Networking Technologies, Coimbatore, India, pp. 1-8, 2012.
[CrossRef] [Google Scholar] [Publisher Link]
[4] Mehmet Ercan Nergiz, and Chris Clifton, “δ-Presence without Complete World Knowledge,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 6, pp. 868-883, 2009.
[CrossRef] [Google Scholar] [Publisher Link]
[5] Ashwin Machanavajjhala et al., “L-Diversity: Privacy Beyond K-Anonymity,” ACM Transactions on Knowledge Discovery from Data, vol. 1, no. 1, 2007.
[CrossRef] [Google Scholar] [Publisher Link]
[6] Yang Xu et al., “A Survey of Privacy Preserving Data Publishing Using Generalization and Suppression,” Applied Mathematics & Information Sciences, vol. 8, no. 3, pp. 1103-1116, 2014.
[CrossRef] [Google Scholar] [Publisher Link]
[7] Ruth Brand, Microdata Protection through Noise Addition, Inference Control in Statistical Databases, Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, vol. 2316, pp. 97-116, 2002.
[CrossRef] [Google Scholar] [Publisher Link]
[8] Gregory S. Nelson, “Practical Implications of Sharing Data: A Primer on Data Privacy, Anonymization, and De-Identification,” SAS Global Forum Proceedings, pp. 1-23, 2015.
[Google Scholar]
[9] Stephen E. Fienberg, and Julie McIntyre, “Data Swapping: Variations on a Theme by Dalenius and Reiss,” Privacy in Statistical Databases: CASC Project Final Conference, Barcelona, Spain, pp. 14-29, 2005.
[CrossRef] [Google Scholar] [Publisher Link]
[10] Sean J. Hickman, and Youyi Mao, “Method and System for Data Pattern Matching, Masking and Removal of Sensitive Data,” US20130167192A1, pp. 1-14, 2013.
[Google Scholar] [Publisher Link]
[11] Joana Ferreira Marques, and Jorge Bernardino, “Analysis of Data Anonymization Techniques,” Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, vol. 2, pp. 235-241, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[12] B.K. Tripathy et al., Privacy and Anonymization in Social Networks, Social Networking, Intelligent Systems Reference Library, Springer, Cham, vol. 65, pp. 243-270, 2014.
[CrossRef] [Google Scholar] [Publisher Link]
[13] Pierangela Samarati, and Latanya Sweeney, “Protecting Privacy when Disclosing Information: K-Anonymity and its Enforcement through Generalization and Suppression,” 1998.
[Google Scholar] [Publisher Link]
[14] Réka Albert, Hawoong Jeong, and Albert-László Barabási, “Error and Attack Tolerance of Complex Networks,” Letters, vol. 406, no. 6794, pp. 378-382, 2000.
[CrossRef] [Google Scholar] [Publisher Link]
[15] Duncan J. Watts, and Steven H. Strogatz, “Collective Dynamics of ‘Small-World’ Networks,” Letters, vol. 393, no. 6684, pp. 440-442, 1998.
[CrossRef] [Google Scholar] [Publisher Link]
[16] Michael Hay et al., “Resisting Structural Re-Identification in Anonymized Social Networks,” Proceedings of the VLDB Endowment, vol. 1, no. 1, pp. 102-114, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[17] Michael Hay et al., “Anonymizing Social Networks,” University of Massachusetts Amherst, pp. 1-17, 2007.
[CrossRef] [Google Scholar] [Publisher Link]
[18] Kun J Ray Liu, and Evimaria Terzi, “Towards Identity Anonymization on Graphs,” Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, Vancouver Canada, pp. 93-106, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[19] Xiaowei Ying, and Xintao Wu, “Randomizing Social Networks: A Spectrum Preserving Approach,” Proceedings of the 2008 SIAM International Conference on Data Mining, pp. 739-750, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[20] Lian Liu et al., “Privacy Preserving in Social Networks Against Sensitive Edge Disclosure,” Technical Report CMIDA-HiPSCCS 006-08, Department of Computer Science, University of Kentucky, 2008.
[Google Scholar]
[21] Elena Zheleva, and Lise Getoor, “Preserving the Privacy of Sensitive Relationships in Graph Data,” Privacy, Security, and Trust in KDD, First ACM SIGKDD International Workshop, PinKDD, San Jose, CA, USA, pp. 153-171, 2007.
[CrossRef] [Google Scholar] [Publisher Link]
[22] Alina Campan, and Traian Marius Truta, “A Clustering Approach for Data and Structural Anonymity,” Proceedings of the 2nd ACM SIGKDD International Workshop on Privacy, Security, and Trust in KDD (PinKDD’08), in Conjunction with KDD, vol. 8, pp. 1-10, 2008.
[Google Scholar] [Publisher Link]
[23] Bin Zhou, and Jian Pei, “Preserving Privacy in Social Networks against Neighborhood Attacks,” 2008 IEEE 24th International Conference on Data Engineering, Cancun, Mexico, pp. 506-515, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[24] Bin Zhou, and Jian Pei, “The K-Anonymity and L-Diversity Approaches for Privacy Preservation in Social Networks against Neighborhood Attacks,” Knowledge and Information Systems, vol. 28, no. 1, pp. 47-77, 2011.
[CrossRef] [Google Scholar] [Publisher Link]
[25] Graham Cormode et al., “Anonymizing Bipartite Graph Data Using Safe Groupings,” Proceedings of the VLDB Endowment, vol. 1, no. 1, pp. 833-844, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[26] Bin Zhou, Jian Pei, and WoShun Luk, “A Brief Survey on Anonymization Techniques for Privacy Preserving Publishing of Social Network Data,” ACM SIGKDD Explorations Newsletter, vol. 10, no. 2, pp. 12-22, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[27] Mohammad Soryani, and Behrooz Minaei, “Social Networks Research Aspects: A Vast and Fast Survey Focused on the Issue of Privacy in Social Network Sites,” arXiv, pp. 363-373, 2012.
[CrossRef] [Google Scholar] [Publisher Link]
[28] Kristen Riedt Lefevre, David Johns DeWitt, and Raghu Ramakrishnan, “Incognito: Efficient Full-Domain K-Anonymity,” Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, Baltimore Maryland, pp. 49-60, 2005.
[CrossRef] [Google Scholar] [Publisher Link]
[29] Qin Liu et al., “Preserving Privacy with Probabilistic Indistinguishability in Weighted Social Networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 5, pp. 1417-1429, 2017.
[CrossRef] [Google Scholar] [Publisher Link]
[30] Gary William Flake, Robert E. Tarjan, and Kostas Tsioutsiouliklis, “Graph Clustering and Minimum Cut Trees,” Internet Mathematics, vol. 1, no. 4, pp. 385-408, 2004.
[CrossRef] [Google Scholar] [Publisher Link]
[31] Tinghuai Ma et al., “KDVEM: a K-Degree Anonymity with Vertex and Edge Modification Algorithm,” Computing, vol. 97, no. 12, pp. 1165-1184, 2015.
[CrossRef] [Google Scholar] [Publisher Link]
[32] Lars Backstrom, Cynthia Dwork, and Jon Michael Kleinberg, “Wherefore Art Thou r3579x?: Anonymized Social Networks, Hidden Patterns, and Structural Steganography,” Proceedings of the 16th International Conference on World Wide Web, Banff Alberta Canada, pp. 181-190, 2007.
[CrossRef] [Google Scholar] [Publisher Link]
[33] Fionn Murtagh, and Pedro Contreras, “Algorithms for Hierarchical Clustering: An Overview,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 2, no. 1, pp. 86-97, 2012.
[CrossRef] [Google Scholar] [Publisher Link]
[34] Aristidis Likas, Nikos Vlassis, and Jakob J. Verbeek, “The Global K-Means Clustering Algorithm,” Pattern Recognition, vol. 36, no. 2, pp. 451-461, 2003.
[CrossRef] [Google Scholar] [Publisher Link]
[35] Michael Hay et al., “Accurate Estimation of the Degree Distribution of Private Networks,” 2009 Ninth IEEE International Conference on Data Mining, Miami Beach, FL, USA, pp. 169-178, 2009.
[CrossRef] [Google Scholar] [Publisher Link]
[36] Shiva Prasad Kasiviswanathan et al., “Analyzing Graphs with Node Differential Privacy,” Theory of Cryptography, 10th Theory of Cryptography Conference, Tokyo, Japan, pp. 457-476, 2013.
[CrossRef] [Google Scholar] [Publisher Link]
[37] Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith, “Smooth Sensitivity and Sampling in Private Data Analysis,” Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, San Diego, California USA, pp. 75-84, 2007.
[CrossRef] [Google Scholar] [Publisher Link]
[38] Francesco Bonchi, Aristides Gionis, and Tamir Tassa, “Identity Obfuscation in Graphs through the Information Theoretic Lens,” Information Sciences, vol. 275, pp. 232-256, 2014.
[CrossRef] [Google Scholar] [Publisher Link]
[39] Alexandre Evfimievski, Johannes Gehrke, and Ramakrishnan Srikant, “Limiting Privacy Breaches in Privacy Preserving Data Mining,” Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, San Diego California, pp. 211-222, 2003.
[CrossRef] [Google Scholar] [Publisher Link]
[40] Rakesh Agrawal, Ramakrishnan Srikant, and Dilys Thomas, “Privacy Preserving Olap,” Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, Baltimore Maryland, pp. 251-262, 2005.
[CrossRef] [Google Scholar] [Publisher Link]
[41] Yufei Tao et al., “On Anti-Corruption Privacy Preserving Publication,” 2008 IEEE 24th International Conference on Data Engineering, Cancun, Mexico, pp. 725-734, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[42] Xuesong Lu, Yi Song, and Stéphane Bressan, “Fast Identity Anonymization on Graphs,” Database and Expert Systems Applications: 23rd International Conference, Vienna, Austria, pp. 281-295, 2012.
[CrossRef] [Google Scholar] [Publisher Link]
[43] Ting Yu, and Sushil Jajodia, Secure Data Management in Decentralized Systems, Advances in Information Security, Springer New York, NY, vol. 33, pp. 323-353, 2007.
[CrossRef] [Google Scholar] [Publisher Link]
[44] Vicenç Torra, Guillermo Navarro-Arribas, and Klara Stokes, An Overview of the Use of Clustering for Data Privacy, Unsupervised Learning Algorithms, Springer, Cham, pp. 237-251, 2016.
[CrossRef] [Google Scholar] [Publisher Link]
[45] Sean Chester et al., “K-Anonymization of Social Networks by Vertex Addition,” ADBIS 2011, Research Communications, Proceedings II of the 15th East-European Conference on Advances in Databases and Information Systems, Vienna, Austria, vol. 789, pp. 107-116, 2011.
[Google Scholar] [Publisher Link]
[46] Chongjing Sun et al., “Privacy Preserving Social Network Publication Against Mutual Friend Attacks,” 2013 IEEE 13th International Conference on Data Mining Workshops, Dallas, TX, USA, pp. 883-890, 2013.
[CrossRef] [Google Scholar] [Publisher Link]
[47] Michael Hay et al., “Resisting Structural Re-Identification in Anonymized Social Networks,” The VLDB Journal, vol. 19, pp. 797-823, 2010.
[CrossRef] [Google Scholar] [Publisher Link]
[48] Faraz Ahmed, Alex X. Liu, and Rong Jin, “Publishing Social Network Graph Eigenspectrum with Privacy Guarantees,” IEEE Transactions on Network Science and Engineering, vol. 7, no. 2, pp. 892-906, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[49] Miguel Ros-Martín, Julián Salas, and Jordi Casas-Roma, “Scalable Non- Deterministic Clustering-Based K-Anonymization for Rich Networks,” International Journal of Information Security, vol. 18, pp. 219-238, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[50] Xiaowei Ying et al., “Comparisons of Randomization and K-Degree Anonymization Schemes for Privacy Preserving Social Network Publishing,” Proceedings of the 3rd Workshop on Social Network Mining and Analysis, Paris, France, pp. 1-10, 2009.
[CrossRef] [Google Scholar] [Publisher Link]
[51] Jordi Casas-Roma, “Privacy-Preserving on Graphs Using Randomization and Edge-Relevance,” 11th International Conference Modeling Decisions for Artificial Intelligence, Tokyo, Japan, pp. 204-216, 2014.
[CrossRef] [Google Scholar] [Publisher Link]
[52] Tianchong Gao, and Feng Li, “Sharing Social Networks Using a Novel Differentially Private Graph Model,” 2019 16th IEEE Annual Consumer Communications & Networking Conference, Las Vegas, NV, USA, pp. 1-4, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[53] Peng Liu et al., “Local Differential Privacy for Social Network Publishing,” Neurocomputing, vol. 391, pp. 273-279, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[54] Tianchong Gao et al., “Local Differential Privately Anonymizing Online Social Networks Under HRG-Based Model,” IEEE Transactions on Computational Social Systems, vol. 5, no. 4, pp. 1009-1020, 2018.
[CrossRef] [Google Scholar] [Publisher Link]
[55] Sepp Hartung, Clemens Hoffmann, and André Nichterlein, “Improved Upper and Lower Bound Heuristics for Degree Anonymization in Social Networks,” 13th International Symposium on Experimental Algorithms, Copenhagen, Denmark, pp. 376-387, 2014.
[CrossRef] [Google Scholar] [Publisher Link]
[56] Seyedhashem Hamzehzadeh, and Sayyed Majid Mazinani, “ANNM: A New Method for Adding Noise Nodes which are Used Recently in Anonymization Methods in Social Networks,” Wireless Personal Communications, vol. 107, no. 4, pp. 1995-2017, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[57] Tianchong Gao, and Feng Li, “Privacy-Preserving Sketching for Online Social Network Data Publication,” 2019 16th Annual IEEE International Conference on Sensing, Communication, and Networking, Boston, MA, USA, pp. 1-9, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[58] Ratandeep Kaur, Manisha Sharma, and S. Taruna, “Ensemble Based Model for Privacy in Online Social Network,” Proceedings of International Conference on Sustainable Computing in Science, Technology and Management, pp. 1277-1282, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[59] Arvind Narayanan, and Vitaly Shmatikov, “Robust De-Anonymization of Large Sparse Datasets,” 2008 IEEE Symposium on Security and Privacy, Oakland, CA, USA, pp. 111-125, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[60] Alina Campan, and Traian Marius Truta, “Data and Structural K- Anonymity in Social Networks,” Privacy, Security, and Trust in KDD: Second ACM SIGKDD International Workshop, Las Vegas, Nevada, pp. 33-54, 2008.
[CrossRef] [Google Scholar] [Publisher Link]
[61] Zhipeng Cai et al., “Collective Data-Sanitization for Preventing Sensitive Information Inference Attacks in Social Networks,” IEEE Transactions on Dependable and Secure Computing, vol. 15, no. 4, pp. 577-590, 2016.
[CrossRef] [Google Scholar] [Publisher Link]