t-closeness And Data Anonymity

Kevin Feasel

2018-11-02

Data

John Cook shares some thoughts about k-anonymity and t-closeness:

The idea of k-anonymity is that every database record appears at least k times. If you have a lot of records and few fields, your value of k could be high. But as you get more fields, it becomes more likely that a combination of fields is unique. If k = 1, then k-anonymity offers no anonymity.

Another problem with k-anonymity is that it doesn’t offer group privacy. A database could be k-anonymous but reveal information about a group if that group is homogeneous with respect to some field. That is, the method is subject to a homogeneity attack.

This is intended to be a “get you thinking” type of post, and John does have links to related posts which flesh things out a bit more.

Related Posts

Containers and Data

Randolph West argues that you should keep data and containers separated: Where it gets interesting is that the SQL Server container is also where the database files are stored by default. I raised a point (which Grant and others have already noted in the past) that persisted storage volumes allow us to throw away a SQL Server […]

Read More

Azure Open Datasets

Jen Stirrup is pleased that Microsoft is bringing back open datasets: Nearly three years ago, I complained bitterly about the demise of Windows Datamarket, which aimed to provide free, stock datasets for any and every purpose. I was a huge fan of the date dimension and  the geography dimension, since they really helped me to get […]

Read More

Categories

November 2018
MTWTFSS
« Oct Dec »
 1234
567891011
12131415161718
19202122232425
2627282930