
Reasoning about unstructured data de-identification


We frame the problem of de-identifying unstructured text within the broader landscape of privacy-enhancing technologies. We then examine what background knowledge can be gained from the stylistic information in a written document alone, and how research on authorship attribution and author profiling can improve our understanding of the inferences that can be drawn from an otherwise de-identified text. Finally, we provide a risk score for determining the likelihood that a message will be attributed to a particular author within a dataset using only author profiling tools.

Keywords: anonymisation; author profiling; authorship attribution; de-identification; risk; unstructured data

Document Type: Research Article

Affiliations: 1: PhD Candidate, University of Toronto; Co-Founder & CEO, Private AI. 2: Professor of Computer Science, University of Toronto; Co-Founder & Chief Science Officer, Private AI.

Publication date: June 1, 2020

More about this publication?
  • Journal of Data Protection & Privacy publishes in-depth, peer-reviewed articles, case studies and applied research on all aspects of data protection, information security and privacy issues across the European Union and other jurisdictions, in the wake of the new EU General Data Protection Regulation (GDPR) and the biggest change in data protection and privacy for two decades.