[rssac-caucus] FOR REVIEW: Harmonizing the Anonymization of Queries to the Root

Tue Feb 27 00:18:39 UTC 2018

> On Feb 21, 2018, at 2:53 PM, Paul Hoffman <paul.hoffman at icann.org> wrote:
> 
> On Feb 13, 2018, at 2:59 PM, Wessels, Duane via rssac-caucus <rssac-caucus at icann.org> wrote:
>> In addition I would really like to see some kind of summary (table perhaps) that presents the following for the various techniques:
>> 
>> - advantages / disadvantages
> 
> I don't think that is possible to do in a clean fashion. The advantages/disadvantages change radically if you are a:
> - RSO
> - Researcher
> - Person who wants your IP address completely anonymized

Really?  The document already talks about advantages and disadvantages (sec 3.1, 3.2, 3.3, appendix C) and AFAICT presents them without any commentary on "who" you are.  I was proposing a summary table.

>> - cryptographic strength (I realize this could be difficult since not all are well-studied at this point).
> 
> You also have to define what you mean by "cryptographic strength". If you mean "how much effort would I need to find the random key so I can de-anonymize the rest of the dataset", 3.1 (mixing with truncation) would require 2^128 operations, 3.2 (Cryptopan) would require 2^128 operations unless the RSO used shortcuts to keep certain CIDR classes together, and 3.3 (ipcrypt) should take 2^128 if there are no attacks on the cipher.

Yeah, that's fair.  I don't have a good definition of cryptographic strength in this case.  Maybe for someone it means de-anonymizing the entire dataset, but for someone else it means de-anonymizing just a few sources.

> 
>> - efficiency (i.e. CPU time to anonymize some amount of (DITL) data).
> 
> That's also difficult to measure given that no one has spent time optimizing the implementations. Please remember that you will only be running the mixing function if the mapping does not already exist in the table, and the mapping will already exist the vast majority of the time unless you are under a DDoS that is using randomized source addresses. Also, if you are about to change the key and start another run, you can pre-fill the table from the previous table and reduce the processing time even further.

I would be interested in the unoptimized cases -- i.e. stateless anonymization of every address, without any lookup tables.  IMO lookup tables are an implementation decision, not a characteristic of the algorithm.

But I also think it probably doesn't really matter.  In my simple tests they were all "fast enough".
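
To make the stateless vs. lookup-table distinction concrete, here is a minimal sketch. It is not any of the document's techniques: HMAC-SHA256 truncated to 32 bits is only a stand-in for the keyed mixing function (Python's standard library has no AES), and the names mix_stateless and CachedMixer are made up for illustration. Note that truncation means two addresses could collide; the sketch ignores that.

```python
import hashlib
import hmac
import ipaddress
import os


def mix_stateless(key: bytes, addr: str) -> str:
    """Anonymize one IPv4 address with a keyed function and no lookup table.

    HMAC-SHA256 is an assumed stand-in for the real mixing function.
    """
    packed = ipaddress.IPv4Address(addr).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    # Keep the first 4 bytes so the result still looks like an IPv4 address.
    return str(ipaddress.IPv4Address(digest[:4]))


class CachedMixer:
    """Same mapping as mix_stateless, but remembers addresses already seen."""

    def __init__(self, key: bytes) -> None:
        self.key = key
        self.table = {}
        self.mix_calls = 0  # how often the mixing function actually ran

    def anonymize(self, addr: str) -> str:
        anon = self.table.get(addr)
        if anon is None:
            self.mix_calls += 1
            anon = mix_stateless(self.key, addr)
            self.table[addr] = anon
        return anon


key = os.urandom(16)
queries = ["192.0.2.1", "192.0.2.1", "198.51.100.7", "192.0.2.1"]

stateless = [mix_stateless(key, q) for q in queries]
mixer = CachedMixer(key)
cached = [mixer.anonymize(q) for q in queries]
```

Both paths produce the same output for the same key; the table only changes how often the mixing function runs (here twice for four queries), which is why I see it as an implementation choice rather than a property of the algorithm.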

> 
>> - whether or not "decryption with the same key" is a property of the technique
> 
> That is only a property of 3.3 (ipcrypt)

Thanks.

> 
>> - known implementations
> 
> For 3.1, the implementation is trivial. For 3.2, there are links to the implementations we know about (although they are not well documented). For 3.3, the implementation is given in the reference.
> 
>> Also I would like to better understand if the different techniques have any different cryptographic properties when there is at least one known true -> anonymized mapping.  I think we should assume it is trivial for a consumer of the anonymized data to inject beacon queries that would enable them to know the anonymized value of a specific source IP.
> 
> For 3.1, there is no linkage between any mappings: that's inherent in AES. For 3.2, there is a linkage if the mapping is in the same prefix as the address in question. In 3.3, if there is no known problem with the algorithm, there is no linkage between any mappings.
> 
> --Paul Hoffman
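
The prefix linkage described for 3.2 can be illustrated with a toy prefix-preserving scheme. This is a sketch in the spirit of Cryptopan, not the actual algorithm: HMAC-SHA256 stands in for the AES-based function, and pp_anonymize and shared_prefix_len are hypothetical names. Each output bit is the input bit XORed with a keyed bit derived from the preceding input bits, so two addresses sharing a k-bit prefix map to outputs sharing a k-bit prefix.

```python
import hashlib
import hmac
import ipaddress


def pp_anonymize(key: bytes, addr: str) -> str:
    """Toy prefix-preserving anonymization of an IPv4 address."""
    bits = format(int(ipaddress.IPv4Address(addr)), "032b")
    out = 0
    for i in range(32):
        # The flip bit depends only on the key and the first i input bits.
        prefix = bits[:i].encode()
        flip = hmac.new(key, prefix, hashlib.sha256).digest()[0] & 1
        out = (out << 1) | (int(bits[i]) ^ flip)
    return str(ipaddress.IPv4Address(out))


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


key = b"k" * 16            # fixed example key, for illustration only
beacon = "192.0.2.1"       # address whose anonymized value is known
neighbor = "192.0.2.200"   # shares a /24 with the beacon
other = "198.51.100.7"     # shares only 5 leading bits with the beacon

anon_beacon = pp_anonymize(key, beacon)
anon_neighbor = pp_anonymize(key, neighbor)
anon_other = pp_anonymize(key, other)
```

With one known mapping (say, from an injected beacon query), anyone holding the anonymized data can tell how many leading bits every other anonymized address shares with the beacon's real address. A scheme with no linkage between mappings gives no such information.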
