Source:
Workshop on Privacy in the Electronic Society (WPES), ACM, Dallas (2017)
Abstract:
It is important to study the risks of publishing privacy-sensitive data. Even if sensitive identities (e.g., name, social security number) were removed and advanced data perturbation techniques were applied, several de-anonymization attacks have been proposed to re-identify individuals. However, existing attacks have some limitations: 1) they are limited in de-anonymization accuracy; 2) they require prior seed knowledge and suffer from the imprecision of such seed information.
We propose a novel structure-based de-anonymization attack, which does not require the attacker to have prior information (e.g., seeds). Our attack is based on two key insights: using multi-hop neighborhood information, and optimizing the process of de-anonymization by exploiting enhanced machine learning techniques. The experimental results demonstrate that our method is robust to data perturbations and significantly outperforms the state-of-the-art de-anonymization techniques by up to 10x improvement.