While I don't care so much who got notified first (any given embargo timeline is going to frustrate large numbers of HN people; if that's a problem for you, start finding bugs or post a bounty), I find several things not to like in Cloudflare's marketing. Near as I can tell, Cloudflare is the origin of the notion that TLS private keys weren't going to be in the heap near packet data, a supposition they jumped to for no reason I can discern.
They didn't find the bug. They benefited from a heads-up on the bug. Then they promoted a mistaken assumption about the bug, along with a gamified challenge site that was in fact a poor vehicle for investigating nginx+OpenSSL (how about, for instance, any decent debugger instead?).
Meanwhile, ops teams at big companies were in heated debates with security about whether keys needed to roll.
There are good people working at Cloudflare and I'm not part of any outrage battalion. I'm just not a fan of how they handled this particular incident.
To be fair, Cloudflare didn't just jump to this conclusion for no reason; they apparently did test this by looking at the location of request buffers compared to the private key, trying to extract it themselves, and investigating the heap layout as a whole: http://blog.cloudflare.com/answering-the-critical-question-c... They probably just got some or all of their assumptions wrong, as many have.
I did what Willem did: I instrumented a small OpenSSL driver program and snapshotted memory. I did not go through the effort Jeremi Gosney and Willem and Thai and Ben Murphey went through to trace things through the code, but it was immediately apparent that there was more going on than the blog post Cloudflare wrote suggested.
More importantly, the whole thesis of that blog post is that RSA private keys are loaded into memory once and never moved again. But that's obviously not true: intermediates based on private key components are created during Montgomery multiplication, for instance.
The bigger problem is not that Cloudflare got things wrong. It's that they (a) marketed the wrong conclusion, and (b) put the conclusion to trial in a way that spent smart people's time unnecessarily.
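For anyone who wants to try the snapshot approach described above, here is a rough sketch of the checking step. It is not the code anyone in this thread ran. It assumes you already have a raw memory snapshot as bytes and the server's public modulus, that the two primes are the same size, and that integers sit in memory little-endian (as on x86). `find_prime_factor` and the toy values are hypothetical names picked for illustration.

```python
def find_prime_factor(snapshot: bytes, modulus: int, byteorder: str = "little"):
    """Return (offset, factor) for the first window of the snapshot that is a
    nontrivial factor of modulus, or None if no window divides it."""
    # Assumption: both primes are half the bit length of the modulus.
    prime_len = (modulus.bit_length() + 15) // 16
    for offset in range(len(snapshot) - prime_len + 1):
        window = snapshot[offset:offset + prime_len]
        candidate = int.from_bytes(window, byteorder)
        # Skip 0 and 1; anything else that divides the modulus is a factor.
        if 1 < candidate < modulus and modulus % candidate == 0:
            return offset, candidate
    return None


# Toy demonstration with two small primes (real RSA primes are far larger).
p = 4294967291  # 2**32 - 5
q = 4294967279  # 2**32 - 17
n = p * q

# Hypothetical "heap snapshot": filler bytes with p buried at offset 64.
snapshot = bytes(range(64)) + p.to_bytes(4, "little") + bytes(range(64))

hit = find_prime_factor(snapshot, n)
print(hit)
```

The point of the design is that you never need the private key to run the check: any window that divides the public modulus is key material, so the same scan works on a snapshot of your own instrumented process.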
Here was our initial conclusion from the CloudFlare post you referenced:
============
We think the stealing private keys on most NGINX servers is at least extremely hard and, likely, impossible. Even with Apache, which we think may be slightly more vulnerable, and we do not use at CloudFlare, we believe the likelihood of private SSL keys being revealed with the Heartbleed vulnerability is very low. That’s about the only good news of the last week.
We want others to test our results so we created the Heartbleed Challenge. Aristotle struggled with the problem of disproving the existence of something that doesn’t exist. You can’t prove the negative, so through experimental results we will never be absolutely sure there’s not a condition we haven’t tested. However, the more eyes we get on the problem, the more confident we will be that, in spite of a number of other ways the Heartbleed vulnerability was extremely bad, we may have gotten lucky and been spared the worst of the potential consequences.
That said, we’re proceeding assuming the worst. With respect to private keys held by CloudFlare, we patched the vulnerability before the public had knowledge of the vulnerability, making it unlikely that attackers were able to obtain private keys. Still, to be safe, as outlined at the beginning of this post, we are executing on a plan to reissue and revoke potentially affected certificates, including the cloudflare.com certificate.
Vulnerabilities like this one are challenging because people have imperfect information about the risks they pose. It is important that the community works together to identify the real risks and work towards a safer Internet. We’ll monitor the results on the Heartbleed Challenge and immediately publicize results that challenge any of the above.
============
Which is exactly what we did. To be clear, we were wrong. Our mistaken assumption was focusing on the private key itself and not focusing enough on the exponents that are used to generate the key -- which is what the researchers who solved the challenge were able to obtain. As the wishy-washy conclusion above hopefully makes clear, even when we said it was hard to get the private keys, we were very uncertain and uncomfortable with that conclusion. That's why we posted the challenge. What the challenge did was answer the question definitively: you can get private SSL keys. Knowing that has been valuable for us in deciding to accelerate the reissue/revocation process for all the certs we manage on behalf of our customers. Remember: at our scale, revoking hundreds of thousands of certs risked breaking our CA partners' infrastructure, so it wasn't without a cost. Knowing the risk is higher than we originally thought accelerated our efforts, which will be complete in the next 48 hours. And, beyond CloudFlare, our hope is that knowing the risk is real and proven has benefited other organizations as well.
Look, I'm arguing this because I'm a nerd and because I know you're wrong, not because I think this is a moral crusade or anything. But, once again:
Your "Cloudflare Challenge" (that's what you called it) was not a particularly useful way to answer the question posed by Heartbleed. What you want to know is, "is there private key material in heap memory?", or, to make things even simpler, "are our assumptions about how key material hits heap memory accurate?". The correct way to answer this question is to instrument and analyze an OpenSSL/nginx runtime, not to create and market a treasure hunt for an undisclosed private key on a single site.
You employ smart people. You could have done better than this challenge. Instead, what seems to have happened is that your company got inserted into the middle of a story it had little to do with (correct me if I'm mistaken and unaware of something your team did to research Heartbleed), and, with that spotlight shining, actively marketed a harmful false conclusion about the bug while also bidding for the spare cycles of other people who might have been more effective doing something other than poking at your server. (For what it's worth, I don't think for a moment that you did either of those things intentionally).
I think if you did either of those two things differently --- either didn't publicly go out on a limb suggesting that you thought keys would be hard (or, as Bruce Schneier seems to have read from your blog post, "next to impossible") to recover, or didn't set up the game site while doing it --- I wouldn't be moved to comment.
Like I said, not after pelts. Just, if we're putting the Cloudflare response up for questioning, I have some issues to point out.
We spent 5+ days trying to get the private key. We, along with a lot of other smart folks, concluded it was unlikely. Within 11 hours of crowdsourcing the problem we were proven wrong. You may not have found that useful; for us, having definitive proof definitely was.
It's interesting you put it that way, because it hadn't occurred to me that your team had an early heads-up on the bug before writing that blog post. So what you're saying is that for 5+ days, the team was working on the assumption that the only time OpenSSL RSA private keys touched heap memory was when they were first loaded.
It seems like you can just look at the code and see how that's not the case. But I might be wrong, too.
It is a tricky thing, being in the center of a critical vulnerability disclosure story.
Since other people seem unaware, I'm going to rip open a personal emotional wound and explicitly tell people: 'tptacek knows exactly what he's talking about because he did in the past, almost exactly, what Cloudflare did right now.
(If you really want to know the issue, google his name with "dns", but it's not really relevant to rehash his mistakes here, except that his choice of words on this thread ("They didn't find the bug. They benefited from a heads-up on the bug. Then they promoted a mistaken assumption about the bug") is a screamingly obvious reference to his old mistakes.)
He's not being hypocritical; he's speaking from the voice of experience and having had things blow up in his face. He's trying to stop other people from making the same mistakes he did. When the guy with the big burn mark on his face talks to your chemistry class about the importance of lab safety, you should LISTEN.
"Experience is a dear teacher but fools will learn at no other." Sometimes your elders know what they are talking about.
(Now to re-bandage that wound. Another big piece of advice from another elder: people change and mature, yourself included. You should get over other people's mistakes.)
An expert's hindsight is 20/20. (Mehta's team had the bug 17+ days, and he still tweeted his reassurance on 4/8 with "#dontpanic".)
The challenge result instantly convinced a lot of people who still had doubts, because of the mixed messages elsewhere. The sideshow drama, much like the catchy name 'heartbleed' before it, worked perfectly for its intended purpose.
Could you shed some light on how this research was conducted? From reading the OpenSSL source and docs it seems pretty clear that the RSA struct will be on the heap somewhere.
Matt, my biggest problem with CloudFlare by a country mile is the ambulance chasing you guys do in your marketing and your penchant for inserting yourself in the story when you're not even involved. Here, you've done it quite obviously: you got predisclosure and you took the marketing opportunity at the expense of security on the public Internet. You were wrong. But you were the worst kind of wrong: you were wrong in a hurry to get your name in front of everybody first. If you had waited, you wouldn't have been wrong, and we would have been able to answer the question regarding keys without the disinformation.
It really pissed me off because I developed the ability to get keys long before you even wrote the post. I commented about it here several days before you wrote the post[1]. After your blog post, I was accused of fabricating the entire story because you said keys were unobtainable. I cannot, legally, release code without opening myself up to legal ramifications for reasons I won't get into here. Then, after your post, people I've known for a long time accused me of making the entire thing up for a "shot at glory." Meanwhile, I had to explain that your blog post was not definitive to multiple people who were reassured by the false security.
I brought your company's marketing strategy up with you before on HN. Remember when your company jumped on nytimes.com getting owned at the registrar[0]? And you wrote a hurried postmortem of events (I'm assuming, from the typos) without even consulting the affected vendor, then went so far as to speculate on what happened at the affected vendor, and made sure your "postmortem" got on top of HN first?
> An e-mail obtained by Matther [sic] Key, an independant [sic] journalist, indicates that the hackers used a MelbourneIT domain reseller account as part of the attack. While we are only speculating at this point, it's possible that there was a vulnerability in Melbourne IT's reseller systems that allowed a privilege escalation.
You replied here on HN with "no good deed goes unpunished" after I expressed my displeasure with your company's behavior in that scenario, and you didn't really address my points. You basically pointed to the CTO of NYT's praise of your company as evidence enough that you did the right thing. You took advantage of the marketing opportunity (which is fine, I don't fault you) at the expense of allowing the affected vendor to even draft a postmortem or contact customers in a timely fashion (I fault you for that).
We get it. You want to position CloudFlare as a superhero company, capable of fixing the Internet problems that the rest of us cannot handle. However, your marketing strategy has alienated me from ever using your services, and I am not alone in that opinion. Please, rethink that strategy. Focus on the product instead of fixing what you perceive to be a broken Internet that only you can fix. That's the obvious tone I get from your marketing and choices of venue.
I'd actually released code to get private keys a day or two before Cloudflare wrote that blog post on the principle that it was easy enough to do that numerous other people probably already had in private. Didn't help; people still kept concluding that it was impossible to get the private key. Cloudflare were the most publicity-hungry by a little but they weren't the only offender; ErrataSec screwed up pretty badly too for example, as did Akamai and I think one or two others.
The last time I saw a company leverage marketing in this way was when Silent Circle shut down their 4-month-old email product in a well-publicized 'stand of solidarity' with Lavabit. People didn't really care as much about that one.
I think I'm fine giving the guy who found the bug a pass for his 140-character suggestion.
I'm not looking to collect pelts. I just think there was a better way to address the question of private key exposure than "The Cloudflare Challenge". In the future, maybe we can address serious questions with engineering instead of marketing stunts; how about the "let's all work to instrument OpenSSL" challenge?
My issue with the Cloudflare Challenge can be summed up like this: no matter what the results are, it will give people a diminished impression of the bug's actual impact. I can't fathom any way in which the Cloudflare Challenge was beneficial to the security of their customers (or anyone else, for that matter), which should be the goal of bounties, rather than simply being a PR move.
My problem is that if you're trying to figure out how key material is distributed throughout heap memory, asking people to answer that question about an unknown private key through heartbleed "peeks" is about the most obtuse possible way to find out.
Yeah, the whole thing left a vaguely unpleasant taste. It worked in this case, but now that the marketing genie is out of the bottle it's going to make security vulnerability intelligence harder to evaluate. IMHO.
No, he didn't. He said that it's unlikely, and he's right. It's unlikely that I'm going to get heads 20 times in a row if I'm flipping a coin, but it suddenly becomes likely if I try it a couple million times. Being unlikely doesn't mean you shouldn't assume that it's possible for a motivated attacker. Especially when the barrier to entry is so extraordinarily low, as we've seen repeatedly (Cloudflare and Akamai, I'm looking at you).
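The "unlikely per attempt, likely over many attempts" point is easy to check with a bit of arithmetic. A quick sketch in Python; the run length of 20 and the two million attempts are illustrative assumptions, and `prob_at_least_once` is a hypothetical helper name:

```python
import math


def prob_at_least_once(p_single: float, attempts: int) -> float:
    """Probability that an event with per-attempt probability p_single
    (0 <= p_single < 1) happens at least once in `attempts` independent tries."""
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p does not round away.
    return -math.expm1(attempts * math.log1p(-p_single))


run_length = 20                 # illustrative: heads 20 times in a row
p_single = 0.5 ** run_length    # chance of that run on any one attempt
attempts = 2_000_000            # "a couple million" tries

print(f"one attempt:  {p_single:.2e}")
print(f"all attempts: {prob_at_least_once(p_single, attempts):.3f}")
```

How quickly the odds flip depends entirely on the per-attempt probability, so the same function can be rerun with whatever estimate you think fits a given server configuration.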
Fair point -- you're right that it's extremely configuration dependent. For many (especially in the Apache realm) it's trivial and very likely; for others (especially Nginx) it's quite a bit more work. But even if he were 100% correct in it being unlikely, it still doesn't justify ignoring the threat, IMO.