[rssac-caucus] FOR REVIEW: Harmonizing the Anonymization of Queries to the Root

Wed Feb 14 15:31:01 UTC 2018

I just read the thread on dns-operations, and Bert's blog post [1] seems to discuss concerns similar to mine. The following link is also a good read:

https://iapp.org/news/a/top-10-operational-impacts-of-the-gdpr-part-8-pseudonymization/

[1]https://medium.com/@bert.hubert/on-ip-address-encryption-security-analysis-with-respect-for-privacy-dabe1201b476


> On 14 Feb 2018, at 14:21, John Bond <john.bond at icann.org> wrote:
> 
> Hi Andrew, and WP,
> 
> Thanks for this document; it looks very good. See below for comments.
> 
>> 1.2 Terminology 
> I think we should just reference RSSAC026 instead of repeating the definition of RSO in this document.
> 
>> 2. Introduction to Anonymization
> Duane already commented on other identifiable information in DNS packets; however, I wanted to specifically highlight EDNS Client Subnet and suggest that anything that operates on the IP source address should also operate on the EDNS Client Subnet option, if present.
> 
>> 2.1 Benefits and Drawbacks of Harmonization of Anonymization
> When discussing the drawbacks, the document only concerns itself with key distribution issues and doesn't address any of the privacy concerns. It seems to assume that the datasets have to be harmonised so research can continue. This may be by design; however, I think the document should at least mention that harmonising the data makes it easier to identify individuals. IANAL, but anonymization of data in this manner may not be enough to prevent it from being considered personally identifiable under regulations such as GDPR, especially given that third parties not under the jurisdiction of the EU have access to the shared key(s). If I were to weigh privacy against the ability to research, then the following options seem worth considering, ordered with the highest level of privacy (and the most difficult to research) first.
> 
> 
> 1) Remove IP addresses completely
> 2) Each operator encrypts the IP address with their own key and rotates the salt every x minutes
> 3) Each operator encrypts the IP address with their own key
> 4) Operators encrypt the IP address with a shared key and rotate the salt every x minutes
> 5) Operators encrypt the IP address with a shared key
> 6) No change
> 
> 
> In my mind, options 2 and 4 are worth considering, as they would allow researchers to track patterns and see data shifting, but would make it difficult to track an individual user across the entire time series. I'm not a researcher, so I don't know what impact this would have, but I think it adds a lot to the privacy of the data set. For instance, in the schemes suggested, if you see that IP 192.0.2.1 (or whatever it is hashed to) always goes to smtp.johnbond.org, then you can probably assume that IP 192.0.2.1 belongs to me. If IPs only ever have a one-to-one mapping, then someone could track my usage through the entire time series. It makes little difference that 192.0.2.1 is not my real IP address and has been anonymised.
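
Inline note on the rotating-salt idea in options 2 and 4 above: the sketch below is one possible reading of it, not anything taken from the draft. It assumes a keyed hash (HMAC-SHA-256) as a stand-in for "encrypts", and it assumes the salt is simply the index of the current time window. The names `pseudonymize` and `rotate_seconds` are hypothetical. Within one window an address keeps the same pseudonym, so short-term patterns remain visible; once the window rolls over, the pseudonym changes and the long-term trail is broken.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize(ip: str, key: bytes, timestamp: int, rotate_seconds: int = 600) -> str:
    """Return a pseudonym for `ip` that is stable only within one time window.

    Assumptions (not from the draft): HMAC-SHA-256 replaces the cipher,
    and the "salt" is the window index derived from the query timestamp.
    """
    addr = ipaddress.ip_address(ip)           # accepts IPv4 and IPv6
    window = timestamp // rotate_seconds      # changes every rotate_seconds
    salt = window.to_bytes(8, "big")
    digest = hmac.new(key, salt + addr.packed, hashlib.sha256).digest()
    return digest[:8].hex()                   # truncated for readability


if __name__ == "__main__":
    key = b"operator-local-secret"            # hypothetical key
    t0 = 1_200_000                            # start of a 600-second window
    print(pseudonymize("192.0.2.1", key, t0))
    print(pseudonymize("192.0.2.1", key, t0 + 599))  # same window, same value
    print(pseudonymize("192.0.2.1", key, t0 + 600))  # next window, new value
```

With an operator-local key this corresponds to option 2; handing the same key to every operator corresponds to option 4. A keyed hash is one-way, so unlike real encryption the key holder cannot decrypt a pseudonym directly, although they can still recompute it for any candidate address.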
> 
> 
>> 3.2 Mixing Bit-By-Bit: Cryptopan
> The Cryptopan paper acknowledges that, due to the one-to-one mapping, it is susceptible to known-plaintext attacks [1], and some services will be trivial to identify regardless of how we anonymise them. I wonder if we could get the paper's authors to re-run their attack scenarios on a Cryptopan-encrypted DITL and see how much of the data could be de-anonymised.
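
Inline note on why prefix preservation matters for the known-plaintext concern above: the sketch below illustrates the bit-by-bit, prefix-preserving structure only. It is not Crypto-PAn itself. It assumes HMAC-SHA-256 as the pseudorandom function in place of the AES-based construction, so its output will not match any Crypto-PAn implementation. The names `prefix_preserving` and `common_prefix_len` are hypothetical. Each output bit is the input bit, optionally flipped, where the flip depends only on the input bits that precede it.

```python
import hashlib
import hmac
import ipaddress


def prefix_preserving(ip: str, key: bytes) -> str:
    """Map an IPv4 address so that shared prefixes stay shared.

    Assumption: HMAC-SHA-256 stands in for the cipher used by Crypto-PAn.
    """
    addr = int(ipaddress.IPv4Address(ip))
    out = 0
    for i in range(32):
        prefix = addr >> (32 - i)             # the first i input bits (0 when i == 0)
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        bit = (addr >> (31 - i)) & 1          # input bit i, most significant first
        out = (out << 1) | (bit ^ flip)
    return str(ipaddress.IPv4Address(out))


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


if __name__ == "__main__":
    key = b"shared-secret"                    # hypothetical key
    x = prefix_preserving("192.0.2.1", key)
    y = prefix_preserving("192.0.2.200", key)
    print(x, y, common_prefix_len(x, y))      # the two outputs still share a /24
```

Because the mapping is fixed and preserves prefix lengths exactly, anyone who learns one real/anonymised pair also learns the anonymised leading bits of every address that shares a prefix with it, which is what makes the known-plaintext attacks effective.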
> 
>> 3.3 ipcrypt
> The one-to-one mapping also means it is susceptible to a known-plaintext attack. The severity is unknown, but the lack of prefix preservation would likely make any attack harder [than the Cryptopan attacks].
> 
>> 4 ASN and recommendation 3
> I'm strongly opposed to this, as it would make de-anonymising the information, and the known-plaintext attacks mentioned above, much simpler to execute.
> 
> Thanks John
> 
> 
> [1]https://www.cc.gatech.edu/computing/Networking/projects/cryptopan/icnp02.ps
> On 13 Feb 2018, at 13:19, Andrew Mcconachie <andrew.mcconachie at icann.org> wrote:
>> 
>> Dear RSSAC Caucus Members,
>> 
>> On behalf of the RSSAC Caucus Work Party on Harmonization of Anonymization Procedures for Data Collecting, please find Harmonizing the Anonymization of Queries to the Root v1 attached.
>> 
>> Please send your comments and/or additions to the list by February 27th, 2018. Depending on the volume of comments received the work party may then decide to create a new version or forward v1 to the RSSAC for a vote on publication.
>> 
>> Thanks,
>> Andrew
>> 
>> 
>> <RSSAC0XX_Harmonizating_Anonymization_Queries_Root_v1.docx>
>> <RSSAC0XX_Harmonizating_Anonymization_Queries_Root_v1.pdf>
>> _______________________________________________
>> rssac-caucus mailing list
>> rssac-caucus at icann.org
>> https://mm.icann.org/mailman/listinfo/rssac-caucus
> 