Selfie's Reflections on Formal Verification for TLS 1.3: Largely Opaque

Selfie is a new attack on TLS 1.3 that was published last week by researchers Nir Drucker and Shay Gueron. The paper’s name is inspired by its core finding:

  1. In TLS 1.3, as in some previous versions of TLS, it’s possible to authenticate a session using a Pre-Shared Key, or PSK. A PSK is simply a secret symmetric key that’s agreed upon beforehand by the two parties. Since only the Client and the Server know a specific PSK, it’s safe to assume that they can authenticate one another using it. We call that property mutual authentication.
  2. If the Client and Server in a TLS 1.3 session use PSK authentication with a certain type of PSK, an attack arises that breaks mutual authentication: specifically, the Client can be led to believe that it’s talking to the Server when it’s actually talking to itself. This is because both the Client and the Server are authenticated using the exact same PSK, which, for the attack to work, has to be one that neither of them specifically generated: an “external” PSK, i.e. a PSK that was distributed out-of-band to the Client and Server and whose generation was not under the specific control of either party.

A Client talking to itself could be seen as having its own messages being “reflected” back at it, hence the name “Selfie”. Here’s a useful sequence diagram from the paper that illustrates what the attack looks like:

Selfie attack sequence diagram.
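
If you prefer code to diagrams, here’s a minimal Python sketch of the same failure. It is emphatically not TLS 1.3 (the hello, the binder and the PSK below are simplified stand-ins), but it shows why a check that only proves knowledge of a shared external PSK says nothing about which principal, or even which role, is on the other end:

    import hashlib
    import hmac
    import os

    # Hypothetical external PSK installed on a machine that runs both a TLS
    # Client and a TLS Server (a stand-in for a real TLS 1.3 external PSK).
    external_psk = os.urandom(32)

    def make_hello(psk, nonce):
        # The Client proves knowledge of the PSK over its hello,
        # loosely analogous to the TLS 1.3 PSK binder.
        return {"nonce": nonce, "binder": hmac.new(psk, nonce, hashlib.sha256).digest()}

    def accept_hello(psk, hello):
        # The flawed property: any peer that knows the PSK is accepted,
        # with no notion of which principal (or role) sent the hello.
        expected = hmac.new(psk, hello["nonce"], hashlib.sha256).digest()
        return hmac.compare_digest(expected, hello["binder"])

    # The Client initiates a session intended for the Server...
    client_hello = make_hello(external_psk, os.urandom(32))

    # ...but the attacker reflects the hello back to the Client's own Server
    # endpoint. Since that endpoint holds the very same external PSK, the
    # check passes and the Client ends up talking to itself.
    assert accept_hello(external_psk, client_hello)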

The Selfie authors then suggest three ways of preventing this kind of attack across all sorts of secure channel protocols that could potentially be affected. Their preferred way is to generate a unique identity for every principal, make sure that identity is known across the network, and then have each pair of principals mix their identities into every session they establish. It’s not a cheap fix: guaranteeing identity uniqueness isn’t trivial, and authenticating identities, especially out-of-band, has long been a troublesome component of secure communication systems.
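
To give a rough sense of what that fix buys you, here’s a hedged Python sketch (the machine identities and the key derivation below are illustrative stand-ins, not the paper’s exact construction): once each principal’s unique identity is mixed into the session’s key derivation, a reflected handshake no longer yields matching keys and therefore fails.

    import hashlib
    import hmac
    import os

    external_psk = os.urandom(32)

    def derive_session_key(psk, initiator_id, responder_id):
        # Mix both principals' unique identities into the session key
        # (a simplified, HKDF-like derivation for illustration only).
        return hmac.new(psk, b"session|" + initiator_id + b"|" + responder_id,
                        hashlib.sha256).digest()

    # Hypothetical unique identities, known across the network.
    machine_a, machine_b = b"machine-A", b"machine-B"

    # Honest session: A's Client talks to B's Server; both derive the same key.
    key_at_a = derive_session_key(external_psk, machine_a, machine_b)
    key_at_b = derive_session_key(external_psk, machine_a, machine_b)
    assert key_at_a == key_at_b

    # Reflected session: A's Client is bounced back to A's own Server, which
    # derives a key over (machine-A, machine-A). The keys no longer match,
    # so the reflection is caught and the handshake fails.
    key_reflected = derive_session_key(external_psk, machine_a, machine_a)
    assert key_reflected != key_at_a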

For TLS 1.3 specifically, however, their mitigations are easier to apply: the Selfie authors recommend using external PSKs only in conjunction with server certificates. A better solution still, they continue, would be to simply forbid the sharing of a specific PSK between more than one Client and Server. I’m personally pessimistic about how enforceable the latter recommendation would be in real-world deployments. But the former is fine.

Now, I’d like to get this out of the way before we continue: I would argue that Selfie has a very limited impact on the practical security of TLS 1.3 in a real-world setting, simply because Clients are very rarely seen having fruitful TLS sessions in which they talk to themselves. Once the Client’s messages are reflected back at it, the most likely outcome is that the session’s data stream will be considered corrupt by the application layer and the entire session will fall over.

However, Selfie remains a perfectly valid attack on TLS 1.3’s PSK-based mutual authentication logic. It’s fair to say that it should have been caught earlier by the TLS 1.3 WG and dealt with appropriately.

Formal verification efforts for TLS 1.3 didn’t catch Selfie…

At first glance, Selfie does indeed seem to fly in the face of the unprecedented effort to formally verify TLS 1.3 as secure.

Formal verification has traditionally meant hand-written, game-based proofs. In recent years, however, it has increasingly become a methodology in which automated, tool-assisted verification is accomplished by describing protocol flows under either the symbolic model (e.g. using ProVerif or Tamarin) or the computational model (e.g. using CryptoVerif).

Symbolic verification allows for the rapid evaluation of real-world protocol flows against specific security queries under an active attacker (“is the authenticated key exchange forward secure?”, “is the first message confidential?”). In the computational model, full game-based proofs can still be obtained, similarly to hand-written proofs, while being easier to tie to specific protocol flows across well-defined principals.

TLS 1.3 received formal verification attention of all sorts: hand-written game-based proofs, automated symbolic verification, computational model proofs, and even a gargantuan effort by Microsoft Research called Project Everest which aims to build an entirely functionally correct and formally verified HTTPS stack.

And yet in spite of all of this unprecedented effort to prove the security of TLS 1.3, Selfie still happened! As the Selfie authors point out in Section 6 of their paper:

The Selfie attack comes as a surprise, since TLS 1.3 is the first TLS version that was designed alongside with security proofs, e.g., the studies in [3, 5, 6, 7, 8, 9, 12, 13, 14]. This is even more surprising since the Selfie attack is actually a manifestation of a reflection attack that is already known in other contexts.

(Full disclosure: I’m a co-author on [3], cited above.)

The Selfie authors then delve into existing work on formally verifying TLS 1.3 to try and find out why Selfie wasn’t spotted. In summary, they correctly conclude that the protocol flows that were envisaged by the formal verification efforts for TLS 1.3 weren’t comprehensive enough to catch the case where a Client uses TLS 1.3 with an external PSK and then proceeds to talk to itself for the entire session.

…because these formal verification efforts were geared towards eliminating far more damaging classes of attacks.

Selfie is a legitimate and interesting attack on TLS 1.3. The Selfie paper is written with care and attention to detail. It’s a finding that deserves to be published.

But I think it’s important to explain why Selfie’s existence doesn’t meaningfully reflect on the status of the formal verification efforts for TLS 1.3.

Selfie is a finding with very limited practical implications, one that relies on a TLS 1.3 protocol flow (a Client talking to itself) that, in real-world usage, would almost certainly lead straight to a corrupt data stream. When the formal modeling of TLS 1.3 was proceeding at full steam, the security community was still reeling from the disclosure of protocol bugs and vulnerabilities belonging to entire classes of attacks far more serious than the one Selfie falls into.

Forget a Client being misled into talking to itself: the focus was on ruling out classes of bugs that were far more critical in nature:

  • Logjam, which hastened the radical reduction of supported cipher suites in TLS 1.3. This class of attacks was subsequently ruled out by formal verification efforts.
  • Triple Handshake, which led to significant changes to mutual authentication in TLS 1.3 that were then formally verified.
  • SMACK, which inspired the radical reduction of TLS 1.3’s state machine space relative to TLS 1.2, a reduction that was then formally verified.
  • SLOTH, which led to the urgent removal of deprecated cryptographic primitives in TLS 1.3 as well as in other, unrelated protocols.
  • POODLE and Lucky13: attacks caused by weak authenticated encryption, which were, again, modeled and formally analyzed in order to directly inform the design of TLS 1.3.

These are the sorts of disastrous attacks that formal verification for TLS 1.3 was focused on ruling out: total session compromise caused by ill-defined protocol state machines with an excessively large state space; protocol failures caused by weak hash functions, Diffie-Hellman groups or long-term keys.

The list goes on. Every bullet point on that list was linked to a critical weakening of the real-world security of TLS, and this is the sort of dangerous stuff that formal verification was focused on ruling out in TLS 1.3. Indeed, some of TLS 1.3’s formal verification efforts were able to rule out attacks such as Logjam or Triple Handshake even when TLS 1.3 was run in parallel with TLS 1.2!

It’s easy to decide you’re justified in deriding formal verification efforts for TLS 1.3 when you’re not clear on what they were actually intended to rule out. Whenever a formal verification result is published, a few things are always clearly mentioned: the protocol flows being verified, the classes of attacks being checked for and ruled out, and the remaining protocol flows that are outside the scope of the accomplished verification effort.

Every one of the formal verification studies cited by the Selfie authors had this sort of diligent scoping written out for its formal analysis. None of them claim to rule out an attack like Selfie’s, because, given the much more serious classes of attacks the modeling was focused on, none of them produced a formal model for it. As such, Selfie’s appearance can’t really be held as a barometer by which to evaluate the accomplishments of TLS 1.3’s formal verification efforts.

Our models could still have been more comprehensive.

That being said, it’s still fair to recognize that, at least when it comes to our own formal verification efforts for TLS 1.3, we could have produced models that were more comprehensive and that better captured the external PSK protocol flows. In our work, we modeled TLS 1.3 session flows in both the symbolic and the computational model, using ProVerif and CryptoVerif. Here’s where we lacked sufficient precision for Selfie to be detected:

  • Our symbolic models were written such that authentication via some PSK was considered “correct” so long as it took place between parties that legitimately held that PSK, with no check on the role or identity of the peer. If the PSK authenticated the Client to the Server, that was “correct”. But if it authenticated the Client to the Client, that was considered “correct” too.
  • Our computational models were written such that they exclusively captured protocol flows where each PSK would be shared only between one Client and one Server, thereby inadvertently implementing a mitigation for Selfie that did not exist in the actual TLS 1.3 specification that we were evaluating.

While it’s true that formal verification for TLS 1.3 would have done better to be comprehensive enough to rule out Selfie, it shouldn’t come as a huge surprise that it didn’t. In fact, one could even argue that formal verification is the reason why attacks on TLS 1.3 protocol flows have been restricted to scenarios like Selfie, i.e. flows that would simply never occur in a real-world setting. At least when it came to our own research, however, we were focused on making sure that the catastrophic bugs that plagued TLS 1.2 and earlier versions could, this time, hopefully be avoided.

On Encryption and Terrorists

This is a repost of something I wrote in November 2015, following the terrorist attacks in Paris. It was subsequently translated into French and published as an op-ed in Le Monde.

In light of the recent terrorist attacks, things are getting heated for the regular security and encryption software developer. Being one myself, I’ve been on the receiving end of a small avalanche of requests from journalists, political pundits and even law enforcement. I’m also someone who was born and raised in Beirut and who recently immigrated to Paris, both of which were the sites of twin attacks, one day apart from each other.

It seems necessary to share some perspective on what’s going on with encryption software, the terrorists supposedly using it, and what this means for the rights and the security of our global communities.

The encryption software community writes a large variety of software, from secure instant messaging to flight tower communication management to satellite collision prevention. We do this for a number of reasons, but there’s always an underlying shared understanding: that we’re using mathematics and engineering to contribute towards a society that’s safer, more capable and able to communicate with a sense of privacy and dignity inherent to all modern societies.

The premise driving the people writing encryption software is not exactly that we’re giving people new rights or taking some away: it’s the hope that we can enforce existing rights using algorithms that guarantee your right to free speech and to a reasonable expectation of privacy in your daily life. When you make a credit card payment or log into Facebook, you’re using the same fundamental encryption that, in another continent, an activist could be using to organize a protest against a failed regime.

In a way, we’re implementing a fundamental technological advancement not dissimilar from the invention of cars or airplanes. Ford and Toyota build automobiles so that the entire world can have access to faster transportation and a better quality of life. If a terrorist is suspected of using a Toyota as a car bomb, it’s not reasonable to expect Toyota to start screening who it sells cars to, or to stop selling cars altogether.

And yet, this is the line of questioning that has besieged the cryptography community immediately after the Paris attacks. A simple mention of my encryption software in an Arabic-speaking forum is enough to put me on the receiving end of press inquiries such as “are you aware of any terrorists using your software? Do you feel it’s your responsibility to monitor terrorist activity?” Or, more bluntly, do I feel like I’m complicit in aiding terrorists, by the simple fact that I write cryptography software or currently do PhD research in applied cryptography? The brouhaha that has ensued from the press has been extreme. I’ve received calls that bluntly want to interview me regarding “technology used by terrorists, such as yours.” A Wired article, like many alongside it, finds an Arabic PDF guide on encryption and immediately attributes it as an “ISIS encryption training manual” even though it was written years ago by Gaza activists with no affiliation to any jihadist group.

In this rush to blame a field that is largely unknowable to the public and therefore at once alluring and terrifying, little attention has been paid to facts: The Paris terrorists did not use encryption, but coordinated over SMS, one of the easiest to monitor methods of digital communication. They were still not caught, indicating a failure in human intelligence and not in a capacity for digital surveillance.

But even in light of all the evidence pointing towards a human intelligence failure, cryptography, being to the outsider a scary and mysterious usage of secret codes and complicated algorithms, remains an easy target. The press again drives the discussion, each time with a lessened priority for measured questioning and proper investigation. Why haven’t you inserted back doors into your software? Do you want terrorists to use your tools? The call for backdoors is nothing new. During my career in the private sector, I’ve been asked to backdoor encryption software so as to please potential investors, and have seen colleagues who appeared to stand for secure software balk under the excuse of “if that’s what the customer wants,” even if it results in irreparable security weaknesses. I’ve had intelligence officers ask me informally, out of honest curiosity, why it is that I would refuse to insert backdoors.

The issue is that cryptography depends on a set of mathematical relationships that cannot be subverted selectively. They either hold completely or not at all. It’s not something that we’re not smart enough to do; it’s something that’s mathematically impossible to do. I cannot backdoor software specifically to spy on jihadists without this backdoor applying to every single member of society relying on my software.

And I’ve seen what guarantees secure communication can give a society. I’ve seen my software used in Hong Kong to organize protests against a government otherwise unwilling to give people their rights. I’ve seen my colleagues produce software used by Egyptians rallying for democracy. I’ve had childhood friends call me from Beirut, desperate to know of a way to organize protests against a government that would lock them up were they to use public phone lines. I’ve set up communication lines for LGBTQ organizations so that they can give counsel without fearing ostracization or reprisal. And in the comfort of my new life in France, I’ve also relied on encryption so that I know I’m obtaining my simple right to privacy when discussing my daily life with my friends or with my partner.

I’ve come to see encryption as the natural extension a computer scientist can give a democracy. A permeation of the simple assurance that you can carry out your life freely and privately, as enshrined in the constitutions and charters of France, Lebanon as well as the United States. To take away these guarantees doesn’t work. It doesn’t produce better intelligence. It’s not why our intelligence is falling short in the first place. But it does help terrorist groups destroy the moral character of our politics from within, when out of fear, we forsake our principles. If we took every car off the street, every iPhone out of people’s pockets and every single plane out of the sky, it wouldn’t do anything to stop terrorism. Terrorism isn’t about means, but about ends. It’s not about the technology but about the anger, the ignorance that holds a firm grip over the actor’s mind.

I grew up and spent a decade of my childhood in south Beirut and was literally neighbors with the security sector of Hezbollah, a guerilla organization that fights frequent wars with Israel. During the 2006 war, an Israeli fighter jet carpet-bombed the entire neighborhood, razing my home and that of many others to the ground. While walking through a field of rubble and unexploded cluster bombs to try and find my house, I distantly saw a friend of mine, far away on the other side of whatever it was that I was staring across. We locked eyes. Then, we burst out laughing. We laughed for a long time.

In 2008, I got the opportunity to move away from Lebanon and to get an education abroad. This opportunity was rare and unusual. Making encryption software is hard, too: for many of my first years abroad, much of my software was riddled with bugs, and it took practice and feedback to start getting things right, which is really what my education outside of Lebanon was about.

Visiting south Beirut a few years later, I found that I had changed but that no one else there had. The rubble was mostly but not completely gone. I also found that people were angry, and that Hezbollah had pledged to rebuild their homes. Left without any hope for a good education, for a happy life, with much of their families missing, with their friends dead, many pledged themselves in return.

That’s what’s causing terrorism, not encryption software.

Repairing a ThinkPad with a Corrupt Thunderbolt Firmware Chip

Update (Nov. 27, 2018): Lenovo has started rolling out BIOS updates that claim to fix this issue for the ThinkPad P1 as well as other ThinkPads.

I recently completed an exciting repair on my new ThinkPad P1. The repair seemed, to me, almost impossible, but due to the encouragement of a kind and formidably skilled friend, I was actually able to pull it off.

It also appears to be the first time this specific issue was ever successfully repaired, so I feel compelled to share my findings in detail.

The Problem

There is a known problem with the high-end ThinkPads of the past couple of years that, for some reason, Lenovo has not fully addressed. The problem has been reported by a number of Lenovo customers. It was also covered just today, for what I believe is the first time, in a mainstream laptop blog under the headline “Some recent ThinkPads can be destroyed by changing a UEFI-BIOS setting.”

The problem is essentially the following:

  1. In modern ThinkPads, there exists an option in the BIOS called “BIOS support for Thunderbolt” or “Thunderbolt BIOS Assist” (depending on the model).
  2. Enabling this option will brick your ThinkPad: whenever your ThinkPad is booted, it will immediately show a black screen with a single white (non-blinking) underscore at the top left. The system never seems to POST and no action is possible aside from shutdown.

This very Monday, I received my new ThinkPad. And like many before me, I went into the BIOS and, thinking that this option “sounded potentially useful”, turned it on. I certainly didn’t expect it to completely brick my brand new computer.

I spent the next twelve hours trying to fix my ThinkPad, eventually wringing my hands and documenting everything I tried inside this miserable Reddit post. Nothing worked and nobody was able to help. The unanimous consensus was that this issue was impossible to repair and that I needed to RMA my ThinkPad for a new one.

Getting the Motivation

I was really keen on this ThinkPad and I didn’t want to just RMA it, despite the apparent intractability of the problem.

I knew that the issue most likely lay in a bricked BIOS. I mentioned this to my friend Jason who had a simple response: just find the EEPROM chip on which the BIOS is stored, hook up to it using electrical wires and re-flash it with a stock BIOS ROM from Lenovo’s website!

This initially seemed very unlikely to work and even mildly crazy. Sure, I could easily spot an EEPROM chip on a motherboard, but:

  • Would I even be able to hook onto the EEPROM chip? (The answer turned out to be easier than expected: a $15 SOIC8 clip that Amazon was able to deliver that same evening!)
  • What would I hook the EEPROM chip into? I don’t exactly have EEPROM programming interfaces just lying around. (It turns out, I did, in the form of my Raspberry Pi, which has GPIO pins!)

I still had to run around Paris all day to find electrical wires, though, which I eventually did at St. Quentin Radio, a great little shop with a Mitch Altman-esque spirit to it.

Things began to look up. I opened up my ThinkPad chassis:

ThinkPad P1 logic board.

With Jason’s help, I quickly spotted two SPI-type EEPROM chips. They’re both so tiny that it was hard to photograph the model number! The first one (a “big” EEPROM with 32MB of memory) was right under the WiFi adapter:

MXIC 25L256.

The second (“small” EEPROM chip with 1MB of memory) chip was right above the green SSD:

MXIC 25L800.

One of these two chips must be the one holding the BIOS, unless I was unlucky enough (and unless Lenovo was dumb enough) for there to be EEPROM chips on the back of the logic board. A quick Google search gave me the information I needed to understand how to plug into the chips:

MXIC 25L800 guide.

Deep Cuts

A crucially useful guide by Tom van Veen taught me how to interface properly with the GPIO pins so that I’d have the Raspberry Pi essentially functioning as an SPI interface. I won’t repeat his guide here, since he already documents the procedure so well. Soon, this is what my desk looked like:

Raspberry Pi with SOIC8 clip.

Raspberry Pi plugged into ThinkPad P1.

The first thing I did was use Flashrom to read the contents of both the big and small EEPROMs onto my Raspberry Pi’s microSD card for safekeeping:

Flashrom on Raspberry Pi.
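
For the curious, the dumps were made with flashrom driving the Pi’s SPI interface, roughly along these lines (the device path, SPI speed and file names below are assumptions about my setup and will vary; the clip also has to be moved between the two chips):

    import subprocess

    def dump_chip(output_path, spispeed_khz=8000):
        # Read the chip currently under the SOIC8 clip into a file, using
        # flashrom's linux_spi programmer on the Raspberry Pi.
        subprocess.run(
            ["flashrom",
             "-p", "linux_spi:dev=/dev/spidev0.0,spispeed=%d" % spispeed_khz,
             "-r", output_path],
            check=True)

    dump_chip("big_eeprom_backup.bin")    # 32MB chip under the WiFi adapter
    # (move the clip to the other chip, then:)
    dump_chip("small_eeprom_backup.bin")  # 1MB chip next to the Thunderbolt ports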

I then used UEFITool to try and understand what the chips held. This is what UEFITool gave me for the big ROM:

Big ROM viewed in UEFITool.

That made sense. The ROM was separated into multiple regions covering the Gigabit Ethernet firmware, something related to the Intel Management Engine, some odd “DevExp2” and “EC” regions and finally the BIOS region, which came in at an offset of 0x1400000. Cool!

UEFITool was much less descriptive about the small ROM (and the strings I was able to pull out of it using the strings command seemed to all be pretty random), but that was okay, since I was pretty sure by then that the big ROM held the BIOS:

Small ROM viewed in UEFITool.

I now needed to find a “fresh” BIOS image and hope that simply reflashing it would work out. I downloaded the BIOS update tool for the ThinkPad P1 from Lenovo’s website and eventually figured out that I could use UEFITool to extract a fresh BIOS image from inside the .FL1 file that the installer extracted onto disk, to be flashed by the accompanying Windows executable (an .FL2 file contained what appeared to be the “EC Region” binary).

(A small note: For some reason, the .FL1 file contained two exactly identical copies of the BIOS stored right after each other in UEFI format. I was never able to understand why, but I did confirm that the two BIOS regions within that file were completely identical. Oh well.)

I extracted the fresh BIOS and created a modified version of the “big” ROM where the “fresh” BIOS was injected starting at 0x1400000. Then it was time to flash the modified BIOS back into the big EEPROM chip:

SOIC8 clip plugged into ThinkPad P1 (close-up).
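
For reference, the splicing step described above amounts to something like the following Python sketch (the file names are made up; the 0x1400000 offset is the one UEFITool reported for this particular ROM):

    BIOS_REGION_OFFSET = 0x1400000  # BIOS region offset reported by UEFITool

    # Load the original dump of the big EEPROM and the fresh BIOS image
    # extracted from Lenovo's .FL1 file.
    with open("big_eeprom_backup.bin", "rb") as f:
        big_rom = bytearray(f.read())
    with open("fresh_bios_from_fl1.bin", "rb") as f:
        fresh_bios = f.read()

    # Overwrite the existing BIOS region with the fresh image, leaving the
    # GbE, ME, "DevExp2" and "EC" regions untouched and the size unchanged.
    assert BIOS_REGION_OFFSET + len(fresh_bios) <= len(big_rom)
    big_rom[BIOS_REGION_OFFSET:BIOS_REGION_OFFSET + len(fresh_bios)] = fresh_bios

    with open("big_eeprom_modified.bin", "wb") as f:
        f.write(big_rom)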

Lo and behold, the computer did something! I got some kind of bleeping and blooping going on, and this would have been the end of it were it not for an unexpected obstacle: the ThinkPad detected that the BIOS had been messed with and was failing a CRC check. It would have allowed me through anyway, but because I had set a supervisor BIOS password before this entire story, it would not allow that security reset to happen despite my re-entering the correct supervisor password. I suspect that if you ran into this same issue but hadn’t set a supervisor password, you’d probably have been fine at this point: I was thwarted by an unlucky tangential circumstance and was ultimately not able to get past the BIOS’s self-test in order to boot.

Things looked pretty hopeless at this stage, but I pressed on. The first thing I did was flash the original backup I had of the big ROM back onto the chip. This was successful and I was again stuck at the black screen with a single unblinking underscore. Sigh.

The Moment of Revelation

But, wait a minute! Didn’t the problem start when I modified something having to do with the Thunderbolt BIOS settings?

And wasn’t the smaller EEPROM chip located right next to the Thunderbolt ports?!

Moment of Great Revelation.

Suddenly it all hit me. I picked up my SOIC8 clip, plugged it into the small EEPROM and tried something crazy: I created a 1MB ROM image filled with just zero bytes and went ahead and completely zerofilled the smaller chip.
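
The zero-filled image itself is as trivial as it sounds; here’s a sketch (the output file name is arbitrary) of producing the 1MB image that then got written to the small chip with flashrom:

    CHIP_SIZE = 1 * 1024 * 1024  # the small MXIC chip holds 1MB

    # Produce an image consisting entirely of zero bytes, sized to the chip,
    # ready to be written with flashrom.
    with open("thunderbolt_zerofill.bin", "wb") as f:
        f.write(bytes(CHIP_SIZE))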

Everything worked. My ThinkPad booted. I was back on Windows. I ran a full Lenovo hardware diagnostic and everything was working perfectly. I literally jumped for joy. It was an incredibly happy and genuine Eureka moment. It felt like the day I fell in love with computing. I mean it.

I later discovered that the Thunderbolt ports weren’t working, but the solution for that was at that point dead obvious. I went and downloaded the ThinkPad P1 Thunderbolt firmware from Lenovo’s website. I had to flash it manually using the Raspberry Pi, since the chip was so dead that the bundled Windows executable wasn’t even detecting it! But when I did, everything worked perfectly again: I was able to transfer files over Thunderbolt as well as connect displays and chargers (I had to pad the firmware I extracted from the installer with zero bytes before flashing it.)
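
The padding was the same idea in reverse; roughly (again, the file names are made up), take the firmware extracted from Lenovo’s installer and extend it with zero bytes to the chip’s full 1MB, since flashrom expects an image that matches the chip size:

    CHIP_SIZE = 1 * 1024 * 1024

    with open("thunderbolt_firmware_extracted.bin", "rb") as f:
        firmware = f.read()

    # Pad the extracted Thunderbolt firmware with zero bytes up to the full
    # size of the 1MB chip before handing it to flashrom.
    assert len(firmware) <= CHIP_SIZE
    with open("thunderbolt_firmware_padded.bin", "wb") as f:
        f.write(firmware + bytes(CHIP_SIZE - len(firmware)))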

Update (Nov. 6, 2018): I can confirm that first zerofilling the Thunderbolt firmware chip and booting is necessary to fix the problem. Flashing the original Thunderbolt firmware before zerofilling the Thunderbolt chip and booting will not fix the issue.

Hear me, Internet! There is no need to RMA your ThinkPads! This problem is solvable!

I would never have had this adventure if it weren’t for Jason’s encouragement. I strongly recommend you check out his work on WireGuard, on which I already teach a whole session in my NYU computer security course.