Comments on thesis

General comments:

*please add*

< First of all, thanks a lot for your reply! I'm going to add the comments and hopefully mark them clearly. In the next couple of days we will pass around the current version of the paper to be published.

> Responses by zzz 2011-04-26

<< Responses by Michael 2011-05-02

*>> zzz 2011-05-03

Specific comments:

Sec. 3.1:

A routerInfo does not contain a "self-signed certificate". It contains a pair of public keys and a "null certificate", which is really just a placeholder for future stuff. It is, however, signed by one of the router's private keys.

< Thanks and fixed.

Sec. 3.3 Eepsites:

The "identifier" (what we call a "destination") is actually 387 bytes in binary and 516 characters when encoded in Base 64, not 517 bytes. Leasesets are looked up in the netDB using the 32-byte SHA256 Hash of the destination.

< Thanks and fixed.
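The size arithmetic above is easy to check: Base64 expands every 3 bytes to 4 characters, so a 387-byte destination encodes to exactly 516 characters, and a SHA256 digest is always 32 bytes. A minimal sketch (the destination bytes here are placeholders, not a real I2P destination):

```python
import base64
import hashlib

# Placeholder 387-byte destination (not a real I2P destination).
destination = bytes(387)

# Base64 maps every 3 input bytes to 4 output characters:
# 387 / 3 * 4 = 129 * 4 = 516 characters, with no padding needed.
encoded = base64.b64encode(destination)
print(len(encoded))   # 516

# Leasesets are looked up in the netDB by the SHA256 hash of the
# destination; a SHA256 digest is always 32 bytes.
digest = hashlib.sha256(destination).digest()
print(len(digest))    # 32
```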

Sec. 4.1:

Long paths might be much harder than in ref. 22; our absolute limit is 7 hops max (due to the max length of the tunnel build request message). As of 0.8.4, the following further restrictions are enforced: Router A will not build a tunnel A-A. An unmodified router B will not build a tunnel A-B-A (although a hostile B could build this tunnel). A tunnel A-B-C-A cannot be prevented even with non-hostile B and C. So the longest path possible without cooperation would be A-B-C-A-D-E-A. More complex long paths using multiple tunnels or garlic messages may be possible, although maximum message expiration times are also enforced.

< We removed the claim that long tunnels might work.
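The hop restrictions above can be summarized as a simple validity check. The sketch below is a hypothetical helper, not actual I2P code; it models a tunnel as the full ordered route including the builder at both ends:

```python
MAX_BUILD_HOPS = 7  # limited by the max length of the tunnel build request message

def honest_peer_accepts(path):
    """Hypothetical check modeling the 0.8.4 restrictions described above.
    `path` is the full ordered route, e.g. ["A", "B", "A"] for tunnel A-B-A."""
    if len(path) - 1 > MAX_BUILD_HOPS:
        return False  # absolute limit of 7 hops
    if len(path) == 2 and path[0] == path[1]:
        return False  # router A will not build a tunnel A-A
    # An unmodified middle router refuses a tunnel where its predecessor and
    # successor are the same router (A-B-A); a hostile router still could.
    for i in range(1, len(path) - 1):
        if path[i - 1] == path[i + 1]:
            return False
    return True

print(honest_peer_accepts(["A", "A"]))                           # False
print(honest_peer_accepts(["A", "B", "A"]))                      # False: honest B refuses
print(honest_peer_accepts(["A", "B", "C", "A"]))                 # True: cannot be prevented
print(honest_peer_accepts(["A", "B", "C", "A", "D", "E", "A"]))  # True: longest without cooperation
```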

Actually, I2P doesn't use peers from the same /16 in the same tunnel. It does allow multiples in the fast tier. Since your attack doesn't require two attackers in the same tunnel, the /16 restriction may not be relevant here.

< You need /16 restriction for inbound tunnels. Here the "guard" node is the last chosen. Discussion in the paper.

Fig. 4.2: outbound tunnel labeled as inbound; The "monitor peers" from Fig. 4.1 with red and black stripes are now labeled "A" in this figure, which is confusing.

< Great catch.

Sec. 5:

You say that each peer was configured for 64 KBps max but isn't that true only for the 40 attack peers? What was the bandwidth configuration for the 30 monitor peers? Was 64 KBps really high enough to be included in the victim's fast tier? Or were the monitor peers modified such that they would only provide "fast" service for the victim?

< Both monitor and attack peers ran with 64 KB/s. Monitor peers just participate in the network and behave like any other node. (No special behavior towards the victim.)

> Hmm, that's discouraging. Obviously the attack is more powerful if the monitor peers don't need to be high-bandwidth. I think what we are seeing is the potential of attacks where the monitors are only "nice" to potential victims, as a way of appearing fast to that victim and getting into their tunnels quicker (even if that's not exactly what you did). In any case, I'm surprised that it didn't take more bandwidth and will have to look into it further. As you know, the primary defense against DDoS is to increase resource requirements, and we have to figure out how to do that.

<< Just to be clear, the 64 kB/s limitation comes from Planet-Lab. We would have configured a higher bandwidth if we could have. I'm not sure it's possible to behave better towards some peers. After all, the victim is initially totally anonymous, and off the top of my head I cannot think of a mechanism that limits tunnel participation of the monitor peers without significantly reducing the possibility to reject tunnel requests of the victim (which would hurt badly). Anyway, I feel like this is a small detail for our work.

*>> Yeah, not so much an issue for the paper, but something for us to think about.

Table 5.2: The network size is estimated to be about 2500 uniques per day, and about 6000 - 7000 uniques per month (source )

< Good to know.

Figures 5.4 and 5.5: What's the difference between these two figures? Just two different examples?

< Yes.

Table 5.5:

What about 3-hop, which is the default for eepsites? 2-hop eepsites are not very secure and 1-hop is trivial. Really need data for 3-hop.

< 3-hop requires handling of the A-F-A-V-B-C-D case (A=attacker, V=victim, B/C/D=bystanders, F=false-positive), which is easy (looking at traffic volumes not changing) but was not (yet) implemented (hence no data presented); preliminary data shows that the overall signal strength is not significantly diminished. The 1-hop and 2-hop data are sufficiently strong. Bottom line: if we have time to add the data, it will be added.

> I didn't expect you would have time to run more experiments; but since your victim was not using the default 3-hop tunnels, some explanation of why you chose shorter hops, and a prediction of how the results would change (if at all) with 3-hop tunnels, is clearly appropriate.

<< There is an explanation of that in the paper, right?

*>> Not that I could find - I skimmed the paper again and also searched for "3-hop".

"for the duration of the measurement": How long was it? minutes, hours, days? The time-to-deanonymize would be good to include here. It isn't clear if you deanonymize in one tunnel lifetime (10 minutes) or it takes multiple successful placements of the monitor peers over a long time period.

< 4 hours of test data. (Addressed in the paper.) Every point is a deanonymization as we described it: tunnel participation in the inbound and outbound tunnels, for 10 minutes.

> OK. I didn't see any mention of the actual duration.

Sec 6 Discussion:

The I2P network is still relatively small but is growing quickly. How about a prediction or sensitivity analysis for a network 10X, 100X larger? The analysis starts with "a" monitor peers out of the victim's fast pool of 30 peers. There's no analysis or discussion of how many monitor peers of a given bandwidth you need in the entire network to attain the number "a" in the victim's fast tier.

< We don't need a constant number of monitors in the victim's fast tier all the time, and obviously this depends on how many attackers, how many monitors, how fast the rest of the network is AND on the configuration of the victim. Too many factors for a controlled experiment with a clear answer. What we show is that an adversary controlling X% of your network can use a DoS on the fast tier-estimate to have MORE than X% of his peers in the victim's fast tier.

*> Following paragraph reworded. > Right. So we're getting to a much clearer statement of the issues. Let X = c/N, where c is the number of attack peers and N is the total network size… or N equals the number of fast peers. If an attacker has X% of the network, or X% of the fastest routers, the network design goal is to prevent an attacker from doing anything to artificially increase the odds of getting into a tunnel to more than X%. Ideally we would like the odds of success of the attack to be X^2, or the effort to be O(X^2) (two attacking routers in the right places in the tunnels - one cell enough). What we (I2P) want to do is enhance our defenses so that the odds really are close to these ideals - in other words, that an attacker cannot influence things to increase his odds of success or reduce the time or effort. As we say on the threat model page, if an attacker owns enough of the network then a user is at high risk (i.e. if X is high enough, X^2 is too high for practical anonymity).

<< I don't really get this statement. TODO

*>> Reworded above
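As a numeric illustration of the X vs. X^2 design goal above (figures are illustrative only, not from the paper):

```python
def attack_odds(c, n):
    """X = c/N: the fraction of hostile routers; X**2: the ideal per-tunnel
    odds of owning both critical positions (illustrative, not from the paper)."""
    x = c / n
    return x, x * x

# E.g. 40 attack peers in a network of 2000 routers:
x, x2 = attack_odds(40, 2000)
print(x)  # 0.02 -> attacker controls 2% of the network
# x2 is then 0.0004: ideally an attacker cannot push his per-tunnel
# odds of success above roughly this value.
```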

In fact, most fast peers are from a Class "O" (greater than 128 KBytes/sec) group of routers, and those are about 20% of the network - so there are perhaps 400 peers that could potentially be in the fast group in today's network of 2000 - 3000 routers.

< QUESTION: How would an I2P router with 64 KB/s even obtain a speed value measurement > 64k for any other peer?

< Also, even Class "O" peers would be DoSed by our attacker, and the victim would (eventually) choose our non-class-O monitor nodes.

> That's a great question. You would think that, to a router of bandwidth B, all peers with bandwidth ≥ B would appear equally "fast" to that router. Yet, by simple observation of the profiles page in my (long-running, low-bandwidth) router console, the vast majority of fast and high-capacity tier routers are class "O". Why is that?

> First, the speed calculation is a measurement of the peak 1 minute speed for a single tunnel. Second, since peers route tunnels for multiple other peers, it's really a measurement of *marginal* speed available, and faster routers have more marginal bandwidth available. Third, a router must be in the high-cap tier to be fast, and the higher-bandwidth routers have more tunnel capacity. Fourth, routers are somewhat biased to have the floodfills in their fast and high-cap tiers, since they talk to them frequently, and floodfill routers are always class "O".

> So maybe what this all means is that in the normal operating case, most of the fast tier peers are actually fast, but in an attack scenario, relatively low-bandwidth peers could still get into the tier relatively quickly, especially if they are "nice to the potential victim" (see above in discussion of monitor peer bandwidth).

<< So you are not completely sure about this either? I think it's worthwhile and important to figure this out!

*>> agreed

So isn't this really about an adversary taking over a large proportion of the entire network, or at least of the network's fast routers? Is I2P any more vulnerable at X% hostile peers compared to other networks? Once you have a large number of hostile fast peers in the network, is the traffic analysis of your attack any quicker or more reliable than other attacks, e.g. first and last node in a tunnel (ref: the "one ping enough" paper or blog post about Tor)?

< Really neither. It is about leaking information about lease sets (vuln. 1), performance-based selection (vuln. 2), long-running services (vuln. 3), a lack of defense (no guard nodes), and the possibility of using a DDoS attack to change the network-view of a *single*, targeted peer (whose identity is NOT known a priori, see below).

Also not discussed - the effect of leaseset size (number of leases or inbound tunnels), which is user-configurable from 1 to 6. It is also configurably dynamic, with fewer leases when the server is idle. A high number of leases makes it quicker for an adversary to enumerate the fast peers. I assume you used the default setting of 2 leases for your experimental victim.

< Our victim was configured for only 1 lease, more leases would have made our attack easier, so we went for the worst-case here. But yes, we mention it now ;-).

Also not discussed - you started from the end, i.e. the identity of the victim, then your whole experiment was to confirm it. To find the victim from scratch would require another O(n) in time or resources, where n is the size of the network, i.e. you have to run the experiment on each router in the network.

< Not correct. While we did the experiment against an Eepsite that we had set up so that we could be sure it worked and minimize impact on real sites, a real adversary would not have n-times more work; the adversary does NOT start with a particular router in mind (this is NOT a confirmation attack) but rather starts with a hostname, looks up leases and then does HTTP requests via the tunnels specified in the lease and DoS on entry nodes from the lease. So while we knew the correct answer, the statistical analysis code / detection routines did not at all have any particular router as a target in mind when the code was executed.

> Oops, of course you are right, I understood that both before and after I wrote that comment but not during :)

<< That happens. :)

Unidirectional tunnels as a "bad design decision":

A bold and absolute statement not fully supported by the paper. It clearly mitigates other attacks and it's not clear how to trade off the risk of the attack described here (either currently or after implementing one or more of the recommendations below) with attacks on a bidirectional tunnel architecture.

< Yes, this is a bit of a bold statement, but only because we focus on one particular attack / system design aspect, and we might be missing other issues. However, we're not aware of any research at this point showing advantages of uni-directional tunnels that would mitigate the issues raised by our analysis. Please let us know if we missed something.

> We aren't aware of any published research either, but here are some comments by "Complication" and then some links to info on our website, documenting the design decision:

<Complication3> unidirectional tunnels appear to make it harder to detect a request/response pattern, which is quite possible to detect over a bidirectional tunnel (Tor circuit)
<Complication3> since some apps and protocols, very notably HTTP and its brethren, do transfer data in such a manner, having the traffic follow the same route to its destination and back could make it arguably easier for an attacker who has only timing and traffic volume data to infer the path a tunnel is taking from those
<Complication3> having the response come back along a different path arguably makes it harder
<Complication3> having proper cover traffic naturally also does

> http://www.i2p2.i2p/tunnel_discussion#tunnel.bidirectional

> http://www.i2p2.i2p/techintro#similar.tor (3rd par.)

> http://www.i2p2.i2p/meeting125 (~13:12-13:30)

> I still don't think you have justified the "bad design decision" phrase so let me try again:

> 1) It's based on an arbitrary certainty vs. time weighting (tradeoff) you have decided on that may not be applicable in all cases. For example, somebody could make a list of possible IPs and then issue subpoenas to each. You could use your botnet to DDoS each in turn and, via a simple intersection attack, see if the eepsite goes down or is slowed down. So close may be good enough, or time may be more important.

> 2) A full analysis of the tradeoffs of unidirectional vs. bidirectional tunnels is clearly outside the scope of this paper, and has not been done elsewhere. For example, how does this attack compare to the numerous possible timing attacks published about onion-routed networks? Clearly you don't have the intention of doing that analysis, if it's even possible to do that review effectively.

> 3) Tor uses bidirectional tunnels and has had a lot of academic review. I2P uses unidirectional tunnels and has had very little review. Does the lack of a research paper defending unidirectional tunnels mean that it is a poor design choice? I think it just means that it needs more study. Timing attacks and distributed attacks are difficult to defend against in both I2P and Tor. The design intent (see references above) was that unidirectional tunnels are more resistant to timing attacks. Yours is a somewhat different type of timing attack though. Is your attack, innovative as it is, sufficient to label I2P's tunnel architecture (and thus I2P as a whole) a "bad design", and by implication clearly inferior to Tor, or is it just a design alternative that clearly needs further investigation and analysis? There's lots of other reasons I would consider I2P currently inferior to some other projects (small network size, lack of funding, lack of review) but is unidirectional tunnels really a reason?

> 4) In summary, "bad design decision" is clearly (to me anyway, since you have not labeled bidirectional tunnels "bad") shorthand for "unidirectional tunnels are unequivocally inferior to bidirectional tunnels", yet this statement is not supported by the paper, nor do I think it was the intended scope of the project.

<< I certainly understand your intention to defend uni-directional tunnels. Our main point is: deanonymizations on uni-directional tunnels take longer (fair to say that's an advantage), but we can be far more certain in the uni-directional case (in fact I think this is exactly Complication's statement). So, for this work, we do not see an advantage of uni-directional tunnels - of course only for the researched setting with long-living Eepsites. Can you argue with that? The text in the paper now says: "using uni-directional tunnels SEEMS to be a bad design decision". Certainly any results that disprove this statement are welcome. I hope you're fine with that.

*>> I appreciate the "seems", but I think we're agreeing to disagree here, which is OK. But since I'm thinking about it now, I'll try one last time, in a single sentence, addressing my point #1 only: Your conclusion is based on your own weighting of the importance of certainty vs. time, and that weighting may be wrong, and it's definitely debatable, especially in a real world with subpoenas, search warrants, and other methods available for final confirmation.

Once the attacker's routers are a large portion of the victim's fast tier (e.g. 'one ping enough'), all sorts of analysis and attacks are possible, and many would be the same or easier with bidirectional tunnels. While we appreciate the innovation of your timing analysis attack with our unidirectional tunnels, the victim is eventually owned via any number of attacks when his fast tier is overtaken.

< This is all a question of degree: how fast can the adversary become how certain of the user's identity. You are right that, in general, the attack still applies to bidirectional tunnels as well.

> See above

Also, given future increases in network size and implementation of recommendations, the tradeoff of analysis time vs. false-positive rate may change. For example, a 30x increase in analysis time for unidirectional tunnels is not insignificant. What if the fast tier size was increased to 1000? Then the time would be 1000x.

< Yes, and accuracy would be x1000000. So while now the adversary might want to observe the victim in the correct position in 100 tunnels for 99.9999% certainty, the same might be achieved with just a single observation once the fast tier grows to 1000 peers (numbers given are random values, not based on specific facts, other than the x1000000). Also, given that the adversary has "infinite" time (Eepsites are online a long time), the time increase is likely to be not so dramatic, whereas accuracy is something that might in general be harder to improve.

> See above

Paper's recommendations:

> Note that my original comments below are not necessarily suggestions for your paper, some are just thinking out loud about countermeasures.

<< Okay, will do the same.

*>> I'll comment briefly here but I've moved further discussion of countermeasures to http://zzz.i2p/topics/895 .

1) Limit churn:

Possibilities: Increase 45 sec evaluation cycle, increase 30-peer fast max and/or 75-peer high-cap max. Downside: larger group makes it more likely for an attacker to eventually be in a tunnel.

< We did not suggest to simply change the tier sizes or evaluation cycles. The suggestion was rather to limit churn, for example by saying that per evaluation cycle, only one peer can be removed from a given tier (adding more might be OK during startup). You might impose further limits, for example that at most 5 peers can be removed (and not re-added) within 10 cycles, etc. That would force the adversary to maintain any DoS for longer, still allow you to have the "fast" tier be small and fast AND possibly help your network stability. 45s might also be a bit fast, but changing that value alone would neither be enough nor go to the heart of our suggestion.

> I guess I used "churn" loosely or perhaps redefined it - I was thinking both about one possible goal, which is to limit the rate of change of Inbound Gateway (IBGW) routers, i.e. leases, to slow down fast tier enumeration. And another goal, the more general reduction of the rate of change of tunnel participants.

<< I think this hurts the performance of I2P very much. Assume you limit the change rate of the IBGW and they are under attack. What would happen when they start to perform worse? Would you still use them?

*>> Sure, that's the problem with any fixed or semi-fixed strategy. You eventually have to switch if they get attacked or simply vanish. The slower you want to change, the worse performance you suffer with before you finally do switch. See http://zzz.i2p/topics/895 for more discussion.
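One way to read the churn-limiting suggestion above is as a per-cycle eviction budget on the fast tier. The sketch below is a hypothetical illustration, not I2P code; the limits of 1 removal per cycle and 5 removals per 10 cycles come from the example in the discussion:

```python
from collections import deque

class ChurnLimitedTier:
    """Sketch of the churn-limiting idea: per evaluation cycle, evict at most
    one peer from the tier, and at most 5 peers across any 10 consecutive
    cycles. Adding peers stays unrestricted (e.g. during startup)."""

    def __init__(self, max_per_cycle=1, max_window=5, window_cycles=10):
        self.members = set()
        self.max_per_cycle = max_per_cycle
        self.max_window = max_window
        self.removals = deque(maxlen=window_cycles)  # removals in recent cycles

    def evaluate(self, ranked_peers, tier_size):
        """One (e.g. 45-second) evaluation cycle: `ranked_peers` is the list
        of peers, best first, that profiling currently prefers."""
        desired = set(ranked_peers[:tier_size])
        self.members |= desired                   # adding is always allowed
        evictable = self.members - desired
        budget = min(self.max_per_cycle,
                     self.max_window - sum(self.removals))
        removed = 0
        for peer in list(evictable):
            if removed >= max(budget, 0):
                break      # budget exhausted: a DoS must persist for longer
            self.members.discard(peer)
            removed += 1
        self.removals.append(removed)
```

Under such a scheme, replacing the whole 30-peer fast tier would take the adversary dozens of cycles of sustained DoS instead of a handful.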

Not a possibility: Increasing 10-minute tunnel lifetime (unfortunately it is essentially hard-coded in the network now)

< This was not suggested and would likely be rather harmful (more data points for statistical analysis by the adversary).

> Right, I was just thinking out loud here.

2) Distributed HTTP services:

This is supported via "multihoming", whereby multiple routers may host an eepsite. This requires some additional setup, and of course requires the user to operate multiple routers. Truly distributed hosting is under development through a port of the Tahoe-LAFS distributed file system to I2P.

< Okay, we mention this ongoing work as a possible solution.

> Tahoe-LAFS is of course just one of a number of possible distributed or DHT-style file systems that could be overlaid on the I2P base technology. It's been said that I2P won't really be complete until we have a solution here. Maybe that's true. But certainly we want to make hosting of a service on a single router as robust and anonymous as we can make it.

<< Noted. Distributing (dynamic) content is very difficult to handle and a hot topic for research, so I can understand this statement. Of course you'll suffer from the service being online a very long time on the same machine. But that's maybe the price you have to pay to not run into other problems.

3) Use random peers for leases (guard nodes):

By this you mean, I think, using random peers outside the fast tier for the inbound tunnel's gateway. We could also keep these peers semi-constant, or more stable, by attempting to recreate the same tunnel at expiration, while still changing them on rejection. This could be done either from the fast tier or by using a random peer. Perhaps each destination could maintain its own "guard tier" that changes slowly.

< You seem to confuse guard nodes (those closest to the victim) and 'entry nodes' (those advertised in the leaseSet). Our suggestion was to pick a completely RANDOM (not-tier, not-fixed) node for the 'entry nodes', so that the adversary cannot learn about your fast tier from the leaseSet. Obviously, there are trade-offs involved.

> yes, I was using "guard nodes" loosely. See above also.

Benefits / downsides? What happens if an adversary attacks guard nodes (either in I2P or Tor)?

< Again, the point here was that the adversary should not be able to figure out the inbound nodes in the first place by not leaking information about likely fast-tier nodes in the leaseSet. We clarified this in the text.

Additional possible changes to I2P not mentioned:

1) Increase resistance to low-bandwidth tunnel building DDoS attack:

Changes made in 0.8.4, more coming in 0.8.5, further research needed

< Yes.

2) Connection limits (limiting the number of requests per client in a minute, hour, or day) are supported in I2P but not enabled by default. These limits, if enabled, would prevent a single client from making requests every 15 seconds forever, although a distributed attack would still be possible. In addition, the server currently sends a connection reset if the limit is exceeded; it would have to drop the request instead.

< Yes.
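A sketch of such a per-client limit, with the request dropped rather than reset when a window is exceeded (the class name and limit values are assumptions for illustration, not I2P's actual settings):

```python
import time
from collections import defaultdict, deque

class ConnectionLimiter:
    """Illustrative per-client request limiter with per-minute and per-hour
    windows. Not I2P code; defaults are made-up example values."""

    def __init__(self, max_per_minute=4, max_per_hour=60):
        self.minute = defaultdict(deque)   # client -> timestamps (last 60s)
        self.hour = defaultdict(deque)     # client -> timestamps (last 3600s)
        self.max_per_minute = max_per_minute
        self.max_per_hour = max_per_hour

    def allow(self, client, now=None):
        now = time.monotonic() if now is None else now
        m, h = self.minute[client], self.hour[client]
        while m and now - m[0] > 60:
            m.popleft()
        while h and now - h[0] > 3600:
            h.popleft()
        if len(m) >= self.max_per_minute or len(h) >= self.max_per_hour:
            # Drop silently: a connection reset would tell the client
            # exactly when the limit was hit.
            return False
        m.append(now)
        h.append(now)
        return True
```

As noted, this only throttles a single client; a distributed attacker with many client identities would still get through.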

3) Would server-imposed response delays help?

< We tried to avoid suggesting measures that obviously trade performance for security. I2P is a low-latency design, so adding artificial delays would seem a rather significant change. If done right, yes, they would likely make the attack harder, at the expense of making I2P slower; this risks reducing the size of the userbase and hence might do more harm than good in the end. Tough call.

> Right, tough call, was again just thinking out loud here.

4) Disallow multiples from the same /16 in the fast tier

< Given that our adversary controls nodes in many /16s, this suggestion would not apply in the context as we presented it, but of course might be a good idea in practice.

5) Increase fast and high-capacity tier maximum sizes

< Not clear (time/accuracy trade-offs), and naturally at some point you'd have everyone in all tiers, so again a trade-off between the (to us not easily quantified) performance benefits of tiers and the security issues of non-random/biased/manipulable performance-based peer selection.

Sec 7 Conclusion:

What the "devs decided":

1) Timetable of 0.8.4 release:

Released March 2, installed in 25% of network by ~March 4, 50% by ~March 6, 75% by ~March 14 (source )

< Okay.

2) Relevant changes in 0.8.4 release:

a) Prevent tunnel-building DoS by a single source. This was done in reaction to the attack.

< NOTE: Our DDoS involved 40 peers hitting a single peer.

> Right. What I meant to say in a) above, (and I elaborated the details in a conversation with you) was a limit on the number of tunnels that can be built in a specific time frame with a single peer as the previous or next hop. These limits are fairly low, such that it will take a large number of peers. This number may be more or less than 40 depending on the capacity of the victim. You were (I think) building one-hop tunnels in the attack. The new restrictions can be evaded by building longer tunnels, with a variety of other peers in intermediate hops. So the new restrictions add some marginal cost but are not a complete solution.

<< I see. As far as I remember, you mentioned that peers reject tunnel participation requests after these low limits are exceeded. You may want to ignore such requests instead. Note: we do not care about the number of tunnel participations. We merely induce traffic on the other node through tunnel requests. So if that peer still uses bandwidth, the DoS attack might still work.
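The restriction zzz describes could be sketched as a sliding-window cap on accepted builds sharing the same previous or next hop (the limit and window values below are illustrative, not I2P's actual ones):

```python
import time
from collections import defaultdict, deque

class BuildRequestThrottle:
    """Illustrative 0.8.4-style restriction: cap the number of tunnel builds
    accepted in a time window with the same previous or next hop. Not I2P
    code; max_per_hop and window are made-up example values."""

    def __init__(self, max_per_hop=10, window=60.0):
        self.max_per_hop = max_per_hop
        self.window = window
        self.recent = defaultdict(deque)   # adjacent peer -> accept timestamps

    def accept(self, prev_hop, next_hop, now=None):
        now = time.monotonic() if now is None else now
        # Check both adjacent peers before accepting anything.
        for peer in (prev_hop, next_hop):
            q = self.recent[peer]
            while q and now - q[0] > self.window:
                q.popleft()
            if len(q) >= self.max_per_hop:
                return False   # too many recent builds adjacent to this peer
        for peer in (prev_hop, next_hop):
            self.recent[peer].append(now)
        return True
```

As noted above, longer tunnels with varied intermediate hops evade such a cap, so it raises the attacker's marginal cost without being a complete defense; and if rejections themselves consume bandwidth, the DoS pressure remains.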

b) Penalize peers more due to tunnel rejections. This did not change the time constants of the capacity formulas, just changed (a + r) to (a + 2r) in the denominator of the formula in section A.1. However, it may have had the effect of reacting faster to a DoS attack. This change was not made in reaction to the attack, but was previously planned and is part of a strategy to spread the traffic across more peers in the network and adjust the formula in response to network conditions that have changed markedly in the past two years.

< Oh, OK, we thought it was a reaction to our activity. Removed.

3) More changes to detect and prevent DoS are upcoming in 0.8.5 (scheduled for release the week of April 18), but these are not a complete solution. A fully distributed tunnel-building DDoS is difficult to prevent completely.

< Right, which is why we said that simply trying to prevent DoS was not the best answer.

Reacting to performance changes as a "bad idea":

A bold and absolute statement not fully supported by the paper. Clearly there are steps we can take to mitigate the attack, especially by detecting and reacting to the DDoS, and other items discussed at the end of Section 6. Also, the network is still very small. We aren't making absolute claims about our anonymity, especially when the whole network is 2000 - 3000 routers. Yes, we still have work to do to ensure anonymity while doing performance-based peer selection, and there are tradeoffs involved. But don't write it off as a "bad idea". A more nuanced statement would be that anonymity vs. performance is a fundamental tradeoff, that the speed of reaction to a peer's perceived performance change affects that tradeoff, and that perhaps I2P's reaction time is too fast for the anonymity its users expect and should be adjusted.

< That was not our statement. We said that reacting to performance changes *quickly* was a bad idea. This refers to the (somewhat theoretical) possibility of the entire I2P fast-tier being replaced by other peers quickly IF the adversary targets the right victims with DoS (in practice, as we have shown, the adversary may not be able to determine the correct set of victims that precisely, but your implementation has no 'speed limit' for the evolution of the tiers (ok, 45s…), which is what this is about). If you limited churn instead (see above), the reaction would take longer, forcing the adversary to maintain the DoS for a longer time.

> I think what you are trying to get across is that there is, of course, an anonymity/performance tradeoff in how fast a router reacts to peer performance changes and that I2P has not correctly balanced that, that we need to react (much?) more slowly than we do now. I don't think that nuance is reflected in "This work shows that peers reacting, and especially reacting quickly, to changes in observed network performance can be a bad idea for anonymizing networks."

<< My main goal was not to refer to a performance/anonymity trade-off. Yes, there is the possibility that if you react quickly to performance changes you end up with higher overall performance. But at least I feel like you can easily change some behavior (the 45s evaluation cycle) without risking really bad overall performance. This is a very general statement, which (I think) our paper shows. I'm not sure we criticize I2P here.

> There's been a lot of research ("one cell enough", etc) on the perils of rapidly changing the peers and peer positions in tunnels. I2P has benefitted from and learned from this research - we use strict ordering of peers in tunnels, we have a 24 hour window for stats, etc., to limit the speed of change. So we are certainly aware of the issue, even as we know we currently have much more frequent tunnel peer change than Tor. However with a lack of guard nodes, lack of constant leases (entry nodes), and fairly rapid fast tier replacement when the fast tier is DDoSed, it appears that the current implementation does react too quickly.

<< I think it's fair to say that I2P has some room for improvement here.

*>> :)

Sec. A.2 Integration value:

This isn't used in I2P for anything except a display and isn't relevant to the paper. You may also wish to remove the information about the well-integrated tier from sections 3.2.4, 3.2.5, and B.1.

< We'll consider this. Thanks for confirming that well-integrated is really used for nothing.

Last modified on May 3, 2011 2:01:46 PM