Cryptology and Information Security

RSA applications of MD5 collisions lattices in cryptology RIES analysis formal privacy modeling traitor tracing


On the topic of RSA cryptanalysis I have supervised Ellen Jochemsz when she was working on her PhD Thesis.

applications of MD5 collisions

Here's a personal account of the history of the application of MD5 collisions to X.509 certificates.
part I - harmless colliding certificates
This line of work started shortly after Xiaoyun Wang took the crypto community by surprise at the Crypto Rump Session in August 2004, by presenting an MD5 collision. That led Arjen Lenstra and me to wondering what nasty things could be done with such collisions. We saw possibilities for interesting applications to X.509 certificates. Applications were already emerging in other areas (software integrity), using the exact collisions Xiaoyun Wang had presented at Crypto, that were using the standard initial IHV.
We had two main problems to solve: we had to neatly hide a random looking data block inside the highly formatted data structure that an X.509 certificate is, and we had to obtain collisions for different initial IHVs than Xiaoyun Wang's original collisions. The first problem was solved by observing that an X.509 certificate already contains a random looking block of data, namely the RSA modulus of the public key. It turned out to be easy to hide the collision inside this modulus, even while retaining the number theoretic structure of the modulus. The second problem turned out to be even easier: one e-mail to Xiaoyun Wang with our specific initial IHV was sufficient to receive the desired collision overnight.
See the old Colliding Certificates website containing a lot of technical stuff. This work led to our ACISP paper, and caused some turbulence in the PKI world.
One reaction that I like is the following, by Jeffrey Kay:
"I just finished reading Colliding X.509 Certificates (...) and I now have chills running up my spine (...) this is a pretty scary paper (...)"
Great. That's why we're doing this.
part II - more interesting colliding certificates
The type of certificates we were able to make using Xiaoyun Wang's type of collision violates fundamental PKI principles, but is completely harmless. This is mainly due to the fact that the owner's identity as present in the certificates must be identical in both certificates forming a colliding pair. This is because Xiaoyun Wang's method required the initial IHVs for the two colliding data blocks to be equal. We now call this type of collisions identical prefix collisions.
At EuroCrypt 2005, where Xiaoyun Wang presented her method for constructiong MD5 collisions, it became clear to us that this was the main obstacle for more interesting colliding certificates. In particular, we wanted colliding certificates with different owner identities. For that we needed chosen prefix collisions, where the two colliding data blocks have arbitrary chosen initial IHVs.
Then Marc Stevens, a master student in Eindhoven, came to me looking for a topic for his master thesis project, to be done within the Dutch National Communications Security Agency NBV. We gave him a choice from a few topics, and he chose hash collisions. After a few months he had considerably improved and accelerated Xiaoyun Wang's method for identical prefix collisions, and had ideas on how to construct chosen prefix collisions. This required massive computational efforts, that luckily can easily be parallelized and distributed. So Marc started the HashClash project, using the BOINC framework, in which volunteers donate their spare cycles to scientific advancement. Soon Marc had a few thousand PCs all over the world running for him, and after some 6 months he finally had a chosen prefix collision. This was crafted such that it could be built in a pair of colliding certificates, this time with different owner identities. See the Chosen-prefix Collision website and the Colliding X.509 Certificates for Different Identities website. Marc presented his work at EuroCrypt 2007, in a paper that was selected by the program committee as one of the three notable papers of the conference.
In June 2007 Marc graduated in Eindhoven as MSc, cum laude, with the highest possible grade for his master thesis. He received the Eindhoven University Best Master Thesis Award 2007.

The Dutch press:
-- The AutomatiseringGids published a news item "'Kraak' van SHA-1 dwingt tot aanpassingen" on their frontpage on February 25, 2005. It's online for subscribers only.
-- The TU Eindhoven weekly Cursor published a news item "Coding & Crypto kraakt MD5-certificaten" on March 10, 2005
-- The Dutch quality newspaper NRC Handelsblad published an article "Gebroken Codes" in its Science section of Februari 11, 2006, and as well online on its site for high school students.
intermezzo - predicting the 2008 US presidential elections
After graduating, before he went to work at CWI as a PhD student of Ronald Cramer, Marc visited Arjen Lenstra in Lausanne for two months. Arjen has a cluster of over 200 PlayStation3 game consoles available for cryptanalytic experiments. Marc developed code for this platform and considerably improved his methods for constructing chosen-prefix collisions for MD5. We now took a less serious but equally interesting and challenging application that was much more appealing to a broad audience: predicting who would become the next US president. Note: this was in the fall of 2007, more than a year before the elections in November 2008. Using only one PlayStation3 Marc produced a chosen-prefix multi-collision for MD5, consisting of twelve perfectly normal pdf files, all having identical MD5 hash values. See the Nostradamus website.
Moreover, as a somewhat more serious (to some) application we constructed a pair of Win32 executables having identical MD5 hash values. Next time you download some software and verify its integrity by an "MD5 checksum", do you trust that there is no other file with the same "checksum" value? See the Colliding Executables website.

These two websites were released on November 30, 2007, and attracted a lot of attention.
A few blogs:
-- wired, slashdot, reddit (1), reddit (2), infoworld, the register, (Spanish) and computable (Dutch).
The international press:
The Economist published in the Technology Monitor on their website a Special Feature item "Making a hash of it", on December 11, 2007.
A slightly modified version of this item, entitled "Making a total hash of it", has been published in print in the Technology Quarterly section of the March 8, 2008 edition.
Science mentions our work in an article Cryptologists cook up some hash for new 'bake-off' (accessible for subscribers only) in their NewsFocus section of Vol. 319, no. 5869, pp. 1480 - 1481, March 14, 2008.
The Dutch press:
-- The AutomatiseringGids (a Dutch weekly IT newspaper) published a news item on December 7, 2007: "Bescherming malware rammelt" (in Dutch). It's online for subscribers only.
-- The Dutch free daily newspaper De Pers (distributed at public places like railway stations) published an item "Hackers gaan PS3" (in Dutch) in the Digitale dingen column by Herbert Blankesteijn, on December 12, 2007. You can download (2.84MB) the entire newspaper, look at p. 18.
-- The TU/e weekly magazine Cursor has an article "Knappe voorspellingen - of toch niet?" (in Dutch), published December 20, 2007.
-- The local daily newspaper Eindhovens Dagblad has an article "TU/e is digitale notaris te slim af" (in Dutch) on December 28, 2007. I love their opening sentence: "De Weger's bookcase does not grow confidence in mankind". Unfortunately the corrections I proposed for this article did not come through.
-- The daily newspaper Trouw has an article "Digi-vingerafdruk is te kraken" (in Dutch), published January 7, 2008.
Thursday January 10, 2008, at 20:00, Marc Stevens and I, with Lex Borger from LogicaCMG as "independent specialist", have been interviewed by Herbert Blankesteijn in his radio program De Elektronische Eeuw (in Dutch) on the commercial news station BNR Nieuwsradio. An mp3 is available (mirror at TU/e: mp3). The mp3 file is about 13 MB, the total time is 22´42´´, our piece starts at 7´45´´, but it is certainly worthwile to listen to the previous piece as well as that is being referred to in our piece.
-- The Dutch Linux Magazine issue of February 2008 has an announcement "Nieuwe kraak MD5 maakt checksums onbetrouwbaarder".
part III - really dangerous colliding certificates
In July 2008 Alex Sotirov, Jake Appelbaum and David Molnar contacted us, as they had ideas on how to mount an attack on a real Certification Authority, based on chosen-prefix collisions, and they needed help on constructing these collisions. This led Marc to making new improvements in his methods, and the full strength of Arjen's PlayStation3 cluster was used to produce, in October 2008, the collision that was needed. The result was a potentially very dangerous attack possibility on the SSL Public Key Infrastructure, in that we had constructed a rogue Certification Authority whose certificate appears to be signed by a commercial CA trusted by the major browsers. This story has been extensively described on the rogue CA website. Alex, Jake and Marc presented this work on December 30, 2008 at the 25th Annual Chaos Communication Congress in Berlin.

It has attracted major media attention, including Newsweek (Feb 9, 2009), the Washington Post, New York Times, Le Monde, Swiss Television News, NRC Handelsblad, and a lot more. Some of the many blogs: Steve Bellovin, Ed Felten, Eric Rescorla, Bruce Schneier, Gene Spafford, Slashdot, Wired. In one blog I found a quote about our work that I like:
"their paper describing the attack is a fascinating read filled with enough gory details to make any security practitioner salivate".
Keep tissues at hand when you click here.

The US Computer Emergency Readiness Team issued a Vulnerability Note, and there were public reactions by companies such as Microsoft, Verisign and Cisco.
The Dutch Government Computer Emergency Response Team GovCert has issued a Factsheet (in Dutch) and mentioned us in their Trendrapport 2009 (in Dutch).

The new developments on MD5 collision construction are described in detail in a paper that was presented at Crypto 2009. This paper also contains details of speed improvements to the identical prefix collision construction, and of a construction method for single block chosen prefix collisions. The Crypto 2009 program committee selected our paper for the best paper award.

Our work made it to number 1 of the Top Ten Web Hacking Techniques of 2009. If that's something to be proud of I don't know.
part IV - really really really dangerous colliding certificates in the wild
On June 4, 2012, only weeks before Marc's PhD defense in Leiden (June 19, with Ronald Cramer and Arjen Lenstra as "promotores", Xiaoyun Wang as one of the "opponent"s, and me as "co-promotor"), I saw on a mailing list a message about the Flame malware being authenticated by a forged certificate appearing to have been issued by Microsoft. The message contained a few certificates, but only Microsoft CA certificates, most of which had already been revoked. The forged certificate was not yet publicly available.

I alerted Marc to this immediately. Marc had, in his time at CWI, developed a very nice "forensic tool", that on input of one message only, checks whether this message has a twin message such that these twins form a hash collision constructed by a method such as Wang's/Marc's. I asked Marc to run his tool on the available Microsoft CA certificates (you never know what might come out), which he did, with negative results.

One day later, Microsoft announced in a blog that indeed the certificate used for authentication of Flame has been constructed by a collision for MD5. In that blog they explicitly refer to our previous work. The forged certificate also became available that day, and Marc ran it through his forensic tool, this time with positive results. Moreover, analyzing the results, Marc discovered that a novel chosen-prefix collision attack had been used. This is very interesting news, both from a scientific viewpoint, and because it gives some insight in the capabilities of the designers of Flame. On the other hand, the data from the certificate also suggest that the attack followed to a substantial extent the path we had laid out in our Rogue CA work; it gives the impression that the Flame designers carefully have studied our work, but gave it their own creative twist.

Further analysis has to be done to say more about the novel attack. I refrain from speculating who might be behind it. However, given the intricacies of this kind of work, they're extremely capable people. I can only hope those people have very good reasons for what they've been using their capabilities for. I'm not yet convinced of that.

And of course one can only wish that parties like Microsoft now finally stop using MD5 in issuing certificates. And, when phasing out MD5 anyway, my advice is to include SHA-1. If you want to know why, read Marc's thesis, and be reminded that Marc's thesis will not be the end of the developments in hash cryptanalysis. It's a matter of time for SHA-1 collisions to appear, and only of some more time for them to be made applicable (read: abusable).

In short: from a scientific viewpoint it's very interesting to see novel attack ideas appear. But I would have preferred them to have come from the open academic community or from the open security researchers community. Above all, I'm not very happy to see our work being abused in malware.

Marc's PhD thesis is available from
websites and papers I co-authored
websites papers magazine articles (in Dutch):

lattices in cryptology


RIES analysis

RIES is an internet voting system that has been used in The Netherlands for expats with national and european elections, and for Water Board elections. In 2008 the Union of Water Boards asked us to do an analysis of RIES. Our report played a role in the government's decision to not allow the use of RIES for internet voting in the 2008 Water Board elections.

Formal Privacy Modeling


Traitor tracing