Isn't that the same? If it was properly anonymized, it couldn't be de-anonymized. Which implies it's not anonymized, i.e., it's stored de-anonymized.


No, it's not the same. It could be stored anonymized, and the de-anonymizing data is somewhere else, and it can ONLY be accessed with a court order. I don't know why my original comment is being downvoted, it's an important distinction.

Edit: stop with the downvotes please. Whether you agree or not, anonymizing something does not always mean it cannot be de-anonymized. And who can do it (and under what circumstances) is important.


If the de-anonymizing data exists at all, then the anonymized data was never truly anonymized in the first place.


Anonymizing does not mean that it has to be one-way. You can give other people an anonymized version of your data, but you can keep the key to deanonymize that data (and hand it out selectively). I don't know who is assuming that anonymizing means the information has to be thrown away and lost to EVERYONE.
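To make the "keep the key" setup concrete, here is a minimal sketch, assuming a hypothetical `pseudonymize` helper and made-up records (none of this comes from a real system). Each identifier is swapped for a random token; the shared copy carries only tokens, and the token-to-identity table stays with the data owner.

```python
import secrets

def pseudonymize(records, id_field="user_id"):
    """Replace each identifier with a random token.

    Returns the shareable records plus the private key table
    (token -> original identifier) that the data owner keeps.
    """
    token_for = {}   # identifier -> token, used only while building
    key_table = {}   # token -> identifier, never shared
    shared = []
    for rec in records:
        ident = rec[id_field]
        if ident not in token_for:
            token = secrets.token_hex(8)
            token_for[ident] = token
            key_table[token] = ident
        # Copy the record, swapping the identifier for its token.
        shared.append({**rec, id_field: token_for[ident]})
    return shared, key_table

# Hypothetical example data.
records = [
    {"user_id": "alice", "cell": "A1"},
    {"user_id": "bob", "cell": "B7"},
    {"user_id": "alice", "cell": "C3"},
]
shared, key_table = pseudonymize(records)

# Whoever holds key_table can reverse the mapping for any token.
recovered = key_table[shared[1]["user_id"]]
```

Note that the same person keeps the same token across rows, so the shared copy still links all of one person's records together.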


If someone can use that information to identify you, then the data is by definition not anonymized. It doesn't matter how exceptional the circumstances where that's allowed to happen are. "Fully 100% anonymous, unless we label you a terrorist" is not the same as anonymous.


If you're going to nitpick like that, then no data is ever anonymized if it contains any information at all. When you start combining pieces of data, they all contribute information that helps you narrow down individuals until there is only one possible match.


Read up on differential privacy and k-anonymization. There are commonly implemented best practices for measuring and preserving anonymity in a dataset in non-reversible ways. They usually involve aggregating clusters of data and dropping clusters with too few unique contributions.

These techniques have a long track record in the private sector and with public entities such as the US Census, with a lot of formal research to back them up.
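A rough sketch of the "aggregate and drop small clusters" idea, assuming a hypothetical `aggregate_with_threshold` function and toy rows of `(user, area, hour)`. This is plain threshold suppression in the spirit of k-anonymity; differential privacy would additionally add calibrated noise to the counts, which is not shown here.

```python
from collections import defaultdict

def aggregate_with_threshold(rows, k):
    """Count distinct users per (area, hour) bucket.

    Buckets with fewer than k distinct users are dropped,
    so no released count describes a group smaller than k.
    """
    users_in_bucket = defaultdict(set)
    for user, area, hour in rows:
        users_in_bucket[(area, hour)].add(user)
    return {
        bucket: len(users)
        for bucket, users in users_in_bucket.items()
        if len(users) >= k
    }

# Hypothetical example data: (user, area, hour of day).
rows = [
    ("u1", "downtown", 9), ("u2", "downtown", 9), ("u3", "downtown", 9),
    ("u1", "suburb", 22),
    ("u2", "airport", 6), ("u3", "airport", 6),
]
released = aggregate_with_threshold(rows, k=3)
```

Only the downtown bucket survives with `k=3`. The released output contains counts, not per-user rows, which is what makes it non-reversible.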


It's not nitpicking. Your definition of "anonymized" leads people to believe they are anonymous when they are not. That can lead to serious consequences.


It’s not my definition. You’re twisting words. Location data is just not “anonymizable” at all because it’s always possible to combine it with other sources.


This does not work for most location datasets. It is easy to re-identify users in this type of dataset with a few lines of code. The identifying information is embedded in the locations and timestamps themselves. Research shows that 4 randomly selected location points from a phone are enough to uniquely identify 95% of the population.
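For a sense of what "a few lines of code" looks like, here is a toy sketch. The traces, the `matching_users` and `unicity` names, and the tiny dataset are all made up for illustration; it does not reproduce the 95% figure, it only shows the matching step.

```python
import random

def matching_users(traces, points):
    """Return the users whose trace contains every given (place, hour) point."""
    return sorted(user for user, trace in traces.items() if points <= trace)

def unicity(traces, p, rng):
    """Fraction of users singled out by p points sampled from their own trace."""
    unique = 0
    for user, trace in traces.items():
        # sorted() gives rng.sample a sequence with a stable order.
        points = set(rng.sample(sorted(trace), p))
        if matching_users(traces, points) == [user]:
            unique += 1
    return unique / len(traces)

# Hypothetical pseudonymized traces: no names, only places and hours.
traces = {
    "u1": {("home_a", 8), ("office", 9), ("gym", 18)},
    "u2": {("home_b", 8), ("office", 9), ("gym", 18)},
    "u3": {("home_c", 8), ("office", 9), ("cafe", 12)},
}

shared_office = matching_users(traces, {("office", 9)})
office_and_gym = matching_users(traces, {("office", 9), ("gym", 18)})
with_home = matching_users(traces, {("home_a", 8), ("office", 9)})
full_trace_unicity = unicity(traces, 3, random.Random(0))
```

One common point matches everyone, two narrow it down, and adding a single distinctive point (the home location) leaves exactly one candidate. Anyone who knows a few places a person has been can run the same match against the "anonymized" traces.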



