At the Berlin Crypto this month we had a talk by Kevin about a shared-secret service they developed at Syselevn. After experimenting with PGP and deciding that it doesn’t do what they needed, they decided to go with a very simple, custom encrypt-then-mac scheme. You can find details here. When someone says they built their own encryption scheme and message format I get obviously curious. In this post I want to summarize the scheme, design decisions, compare it standard authenticated encryption schemes, and ponder the question of the right security definitions.
Note that I refrain from formal definitions in this post. I only want to give the intuition. Please read the linked documents for formal definitions and details. See for example the Authenticated Encryption paper by Bellare and Namprempre for details on authenticated encryption and the Authenticated Encryption with Associated Data by Rogaway on AEADs.
Shared-Secrets
The service and scheme are described as follows:
Shared-Secrets is an application that helps you to simply share one-time secrets over the web. Using the Shared-Secrets service allows you to transfer the actual secret in an encrypted form. Retrieving the secret is as simple as following a link. In contrast to other secret sharing services, Shared-Secrets does not store the secret on the server, but puts the encrypted secret into the link that you share with the desired recipient.
I leave aside the fact that any attacker that can intercept the message can trivially retrieve the secret and focus on the encryption scheme used for the service. Note that the service allows to encrypt the message with a password before sharing it to prevent the aforementioned attack.
The encryption scheme used for this service is a basic encrypt-then-mac scheme that can be written as
(c, t) <- EtM(key, nonce, msg, aad)
where aad
contains all meta-data (version etc.).
Keys for the mac and encryption algorithms are derived as follows
key <- random() # 32-byte
(k_c, k_t) <- (HMAC-SHA256(key, "enc"), HMAC-SHA256(key, "mac"))
rsakey <- RSA-OAEP(pk, key)
The RSA key pk
is the server’s public key.
Now we can compute the ciphertext and mac.
c <- AES-256-CTR(k_c, nonce, msg)
t <- HMAC-SHA256(k_t, aad, c)
The message format is defined as
[version:01][rsakeycount:02][rsakeyid:32][rsakeylength:02][rsakey:mm][...][rsakeyid:32][rsakeylength:02][rsakey:mm][nonce:16][message:nn][mac:32]
with rsakeyid := SHA256(pk)
and nonce := $(date +%s)|0..0
(8-byte unix timestamp, padded 8 zero bytes).
Observations
Looking at the scheme there are a two things that stand out. This is in addition to the unclear threat scenario that I won’t discuss in this post.
Note that there’s currently no standardized way of doing hybrid encryption. There’s currently an RFC in the making but until that’s finished it is necessary to define custom hybrid schemes as done here.
The nonce is not random
When using encryption schemes such as AES-CTR it is paramount that the nonce is unique. If this is not the case, the key-stream becomes known and allows full message recovery. Using a timestamp as nonce is usually a bad move as time is predictable and not random.
How is this different from an AEAD
The scheme as described above looks like an AEAD scheme. So the question is why didn’t they just use an established AEAD scheme such as AES-GCM or ChaCha20Poly1305.
Encrypt-then-Mac == AEAD?
Let’s talk about encryption and authentication first. Historically encryption offered confidentiality but no integrity. Something that’s not immediately obvious to non-cryptographers who want to use encryption to protect their data. Authenticated Encryption (AE) and its relation to non-authenticated encryption was first described in the previously mentioned paper by Bellare and Namprempre in 2000.
Encryption was considered secure if it satisfied the indistinguishable under chosen plaintext property, i.e. if an attacker couldn’t distinguish whether a ciphertext encrypts message a
or message b
.
But “security” of an encryption scheme intuitively might mean something else as well.
In addition to not being able to know which message is encrypted in a ciphertext it shouldn’t be possible for an attacker to change the content of the message or the ciphertext without the recipient noticing it.
This property can be described as ciphertext (or plaintext) integrity and allows the definition of authenticated encryption.
Adding associated data (AD) that is not encrypted but authenticated we get Authenticated Encryption with Associated Data (AEAD).
For a comprehensive overview of AEAD and its history I recommend checking out these slides by Phillip Rogaway.
Encrypt-then-Mac != AEAD
Generally encrypt-then-mac schemes can be considered AEADs. For this to be true however, the Mac has to be strongly unforgeable. Luckily this is the case for HMAC such that the scheme described above is fine in this regard.
A MAC scheme is weakly unforgeable (WUF-CMA) if an attacker is not able to generate a tag for a new message, i.e. a message that she hasn’t seen a tag for before. If a scheme is strongly unforgeable (SUF-CMA), it must me impossible for an attacker to generate a new message, tag pair, i.e. not only a new message but also a new tag. While SUF-CMA seems like an artificial extension of the intuitive notion of WUF-CMA it is necessary to build an AEAD as WUF-CMA doesn’t provide integrity guarantees for the ciphertext or plaintext.
AEAD Properties
The motivation to build this custom AEAD scheme instead of using an existing one was a big point of discussion at the meetup. The essence is that Kenny didn’t trust that an AEAD provided exactly what he needed. So what are properties of an AEAD? Generally AEADs are supposed to offer IND-CCA security where the attacker is allowed to decrypt arbitrary ciphertexts. Instead of thinking of AEAD as IND-CCA secure it is more intuitive to think of it as IND-CPA secure (i.e. the attacker can’t do better than guessing which message is encrypted in a ciphertext when given the ciphertext to one of two adversarially chosen messages) and offering INT-CTXT, i.e. integrity of the ciphertext. Note that this does not imply authenticity of the plaintext because IND-CCA does not guarantee that it is impossible for an attacker to generate a valid ciphertext for a specially crafted message.
Uniqueness of Tags
The property that was questioned to be part of the AEAD security definition but was important to the Shared-Secrets service is the uniqueness of the ciphertext and tag. This is necessary to make sure secrets can only be retrieved once. According to the website the server uses a fingerprint to achieve this. First I notice that fingerprint is not defined in the encryption scheme. Looking at the code it appears that the fingerprint is the tag. The fingerprint is stored on the server and the service refuses to decrypt anything with the fingerprint if it did so once before.
So the question is whether the used encrypt-then-mac scheme makes sure that the tag is unique and whether an AEAD would have offered this property as well.
In other words, can an attacker generate a second, different, valid tag t'
for an existing ciphertext, tag pair (c, t)
.
Note that I do not consider encodings here.
(Using base64 encoding for example it is possible to generate two different base64 strings that decode to the same binary message.)
To this end there has to exist a pair (c, t), (c', t')
with Dec(c) == Dec(c') && Verify(t) == Verify(t')
i.e. two ciphertext, tag pairs that decrypt to the same message and both tags are valid.
First note that the unforgeability of the Mac ensures that it is impossible to generate a valid tag t'
for a given ciphertext c
without knowledge of the key.
It follows that the ciphertext c'
has to be different from the original ciphertext c
.
But without knowledge of the message m
encrypted in c
(or the key) it is impossible to generate a ciphertext c'
that decrypts to m
.
Hence this is not possible.
While everything else in this post were well established properties of AEADs this one doesn’t appear to follow trivially. However, this applies to the custom scheme used in Shared-Secrets as well as an off-the-shelf AEAD.
Timestamps as Nonces
Non-random nonces break most AEAD schemes. This is one reason Misuse-Resistant AE (MRAE) was introduced and specified for schemes such as AES-GCM (AES-GCM-SIV).
So how does choosing a timestamp as nonce fare in the Shared-Secret scheme?
Recall that AES in counter mode, as used here, is an XOR of a plaintext block with the AES encryption of the concatenation of the nonce and the counter (c_i = m_i xor AES(key, nonce||ctr)
).
Thus, one can recover an unknown plaintext by computing the xor of its ciphertext with the xor of a known ciphertext, plaintext pair (m_i = c_i xor (c'_i xor m'_i)
) when the nonce is known.
While the Shared-Secrets scheme is very fragile here and not well-designed here (why not just take a random nonce?) I can’t see how this can be exploited. Even if an attacker is able to generate two ciphertexts with the same nonce (this can be easily done by sending two requests to the server at the same time), the key will be different in both cases.
Discussion: How to define security?
This episode illustrates how important it is for security definitions to model real-world scenarios that people actually have when using a primitive. But it also shows that sometimes, even though standard primitives with appropriate security definitions exist, communicating these properties fail. The use case of the Shared-Secrets is a peculiar one but highlights these issues. Something I did not expect is that people rather define their own crypto schemes than using existing ones because they think they don’t understand the properties properly. But this is all the more reason to make sure the exact security properties a scheme offers are communicated and well understood by everyone.
For me this is another pointer that it is important to have properly working communication channels between people analyzing crypto and everyone using it in order to transport requirements and make sure everyone is on the same page. And this is, among others things, what we do at the Berlin Crypto meetup in a very informal and local setting.