Recipient-Hiding and Related Properties for Public Key Encryption

Unfinished draft

David Hopwood <hopwood@zetnet.co.uk>
WWW: http://www.users.zetnet.co.uk/hopwood/
PGP key fingerprint: RSA 2048 / 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01

Abstract

Recipient-hiding is a property of public key encryption schemes that allows all information as to the public key of the recipient to be hidden from observers.

Extensions to this idea also allow public key ciphertexts to be indistinguishable from random data, or for multiple public keys to correspond to a single private key, without allowing the public keys to be linked. These properties have applications to steganography and anonymous communication. A formal security model for these properties is described, together with several schemes based on Diffie-Hellman and RSA that are provably secure in this model.

Recipient-hiding

A recipient-hiding public key encryption scheme has the property that a ciphertext does not reveal any information about which public key was used to encrypt (and hence which user is the intended recipient of a message).

As an example of a scheme that is not recipient-hiding, consider RSA: each user U_i has a public key pk_i = (n_i, e_i) such that all the n_i are distinct ¹. Encrypting with pk_i gives ciphertexts that, for the purpose of this attack, can be treated as being uniformly distributed random integers in the range 0..n_i-1.

If two public keys have moduli n₁ and n₂ that are not too close together, the corresponding ciphertext distributions will be distinguishable, as shown in Figure 1. If moduli are chosen at random (as they normally would be for standard RSA), then given sufficient messages encrypted to a given unknown key, and a set of keys that could have been used, an attacker can determine with high probability which key was actually used (and can rule out some of the other keys with certainty).

[Figure showing two distributions as lines of randomly spaced dots - one of length n1, the other of length n2]
Figure 1: Distributions of RSA ciphertexts for moduli n₁ and n₂.

Similar attacks are possible for discrete log-based cryptosystems that use a different modulus (or more generally, a different finite group or subgroup) for each user. There seem to be two possible approaches to preventing these attacks:

keep different (or potentially different) parameters for each user, but pass the ciphertext through a function that modifies the distribution so that it cannot be distinguished between users. This is the approach used by SRH-DHAES and SRH-RSA-OAEP, described later.
use the same group parameters for each user. This approach is simpler, but can only be used for discrete-log-based schemes. It is used by RH-DHAES, BRH-DHAES, and BSRH-DHAES.

In order to prove that a particular scheme is recipient-hiding, we need to have a formal model of precisely what that means. The model we will use takes into account adaptive chosen plaintext ² and adaptive chosen ciphertext attacks. We have chosen to use DHAES ("Diffie-Hellman Authenticated Encryption Scheme") as the basis for our discrete-log-based schemes, because DHAES is efficient, flexible in terms of the group to be used, and has security proofs that can be easily adapted for our purposes.

¹ If different users share an RSA modulus, they can determine each other's private keys; see Fact 1 in [RSA20].

² Adaptive chosen plaintext attacks do not normally need to be considered explicitly for public key algorithms, because the attacker is assumed to have access to the public key. In the case of recipient-hiding, the security property we are testing is whether an attacker can determine which public key was used, and so this key cannot be given in advance. Instead, the attacker will be given an encryption oracle.

Steganographic recipient-hiding

Recipient-hiding encryption schemes work by ensuring that ciphertexts follow a distribution that is independent of the public key (as far as can be determined by a computationally bounded attacker). However, this distribution is not necessarily independent of the choice of encryption scheme. "Steganographic recipient-hiding" is a stronger property that requires the ciphertext to be indistinguishable from uniformly random data of a given length.

In the case of symmetric ciphers, the ciphertext is often assumed to "look random", and the ability to distinguish ciphertext from random (called a "distinguisher attack") is commonly taken to indicate a weakness in the cipher. For a block cipher in a standard chaining mode (e.g. CBC, CFB or OFB) or for an additive stream cipher, these are indeed valid assumptions, but more generally, the definition of a semantically secure symmetric cipher does not require that ciphertext be uniformly distributed. This also does not hold for typical public key ciphers.

The term "steganography" describes techniques that attempt to hide data by making it appear to be something else. A simple form of this would be to hide text in the low order bits of raw audio data, which would not result in any immediately obvious change to the recorded sound. (In practice, more sophisticated techniques are needed in order to prevent attacks that detect patterns in the low-order bits of real audio files, that would not be present in the steganographically modified files if these bits are simply replaced. Similar problems occur for images and other file types. Details are beyond the scope of this paper, but see [need ref].)

[[define target distribution]]

In many applications either the target distribution is uniformly random, or a uniform random distribution can be used as an intermediate step. In particular, any steganographic application that attempts to hide encrypted data will already need to be able to go from a uniform distribution to the target distribution. Without loss of generality, then, we only need to consider how to produce public key ciphertexts that are uniformly random; the problem of how to massage these into the required target distribution (if different from uniform) can be treated as a separate problem.

Public key blinding

Another potentially useful property for public key encryption is the ability to have several public keys corresponding to the same private key, in such a way that the public keys cannot be linked with each other. Suppose for example that Alice wants to send several messages anonymously (as opposed to psedonymously, i.e. without revealing that all the messages came from the same source). She also wants to be able to receive encrypted replies. There are several protocols in the literature ([Babel], [Nymserver]), that would allow Alice to send anonymous messages and receive replies, by using variants of David Chaum's mix-net protocol [Mixnet]. However, these do not easily allow encryption of the reply in the case where Alice is anonymous.

If Alice uses a single key pair and includes the public key in each message, then the messages can be linked; an observer is likely to infer that they were all sent by the same person [footnote: forgery doesn't matter in this case] Alice could use a separate key pair for each communication, but that would require her to maintain many different key pairs, and to try decrypting incoming messages with each of them. It would be far more convenient to be able to use a single private key, that corresponds to many unlinkable public keys. Key blinding allows exactly that.

Summary of schemes

Scheme Key blinding? Steganographic? Common parameters? Based on
RH-DHAES No No Yes DHP
SRH-DHAES No Yes Optional DHP
BRH-DHAES Yes No Yes Decisional DHP
BSRH-DHAES Yes Yes Yes Decisional DHP
SRH-RSA-OAEP No Yes No RSAP

Formal security modelling

The type of model we will use is based on the "concrete provable security" approach espoused by Bellare and Rogaway, e.g. [Concrete], [DHAES]. Readers who are already familiar with this may wish to skip to the next section.

[DHAES] explains this approach as follows:

Let S be an encryption scheme which makes use of a primitive P, and let A_S be an adversary which attacks S. To show the security of S one converts A_S into an adversary A_P which attacks P. Ideally, A_P should use the same computational resources as A_S and, with this investment in resources, A_P should be just as as successful in attacking P as A_S was successful in attacking S. This way "practical" attacks on P imply practical attacks on S, and so the assumed absence of practical attacks on P implies the absence of practical attacks on S.
To quantify how close to this ideal we come we define the success probability of A_P attacking P, we define the success probability of A_S attacking S, and then we give concrete formulas to show how A_P's computational resources and success probability depend on A_S's computational resources and success probability. These formulas measure the demonstrated security. By giving explicit formulas we make statements which are more precise than those that are given in doing asymptotic analyses of reductions.

As an example, consider the following definitions for a symmetric cipher:

Let SYM = (SYM.encrypt, SYM.decrypt) be a symmetric encryption scheme with key space KEY:
SYM.encrypt : KEY PT Coins CT
SYM.decrypt : KEY CT PT
and let A be an adversary.

SYM.encrypt is a non-deterministic function, which we model by defining Coins as the set of possible random inputs. Given a non-deterministic function

f : ... Coins ...,

the Coins argument may be omitted to denote the distribution given by applying f. The notation "x

_R f(...)" means that x is chosen at random from the distribution { f(..., coins) : coins

_R Coins }. [Note that in all of our security definitions, random number generators will not be modelled explicitly, and are assumed to produce output that is indistinguishable by an adversary from a uniformly random independent distribution. Modelling them explicitly probably wouldn't help much: we already know that the RNG has to be secure, and it would complicate the definitions considerably.]

The advantage of A in attacking the semantic security of SYM under adaptive chosen plaintext attack, is defined as:

Adv_CPA(A, SYM) = 2.Pr[ K _R KEY;
(P₀, P₁, state) := A.find(SYM.encrypt_K);
b _R {0, 1};
C _R SYM.encrypt(K, P_b);
A.guess(SYM.encrypt_K, C, state) = b] - 1

This definition of advantage corresponds to the following experiment:

choose a random key K.
call the adversary's function "A.find(SYM.encrypt_K)", which chooses two plaintexts (P₀ and P₁) that it will try to distinguish between. Here "SYM.encrypt_K" is an encryption oracle, i.e. the adversary is given the ability to encrypt arbitrary chosen plaintexts with the key K, without knowing what K is. The find function also returns a state for later use by the adversary.
choose one of the plaintexts, P_b, according to a fair coin toss (i.e. b = 0 or 1).
encrypt P_b using K to give C.
call the adversary's function "A.guess(SYM.encrypt_K, C, state)", which tries to guess which plaintext was encrypted.

The advantage of the adversary, Adv_CPA(A, SYM), is dependent on the probability of it succeeding in its guess (taken over the random choices in key generation and encryption). Since the adversary can achieve a success probability of 1/2 by random guessing, the probability is scaled by multiplying by 2 and subtracting 1, to give an advantage ranging from 0 (maximum security) to 1 (maximum insecurity).

It turns out that this "find-then-guess" definition, where the adversary chooses two plaintexts and has to determine which of them was encrypted, is equivalent to testing whether the adversary can find any information at all from the ciphertext (see [FindThenGuess]). Following [DHAES], we will also use a modified form of this type of definition for public key ciphers.

In an approach that only considered asymptotic security, we would define SYM in terms of some security parameter t, and say that SYM is secure (in this attack model) if t can be chosen to make Adv_CPA(A, SYM) negligably small (say less than 2^-t) for any A.

The security of SYM is defined by the function
InSec_CPA(SYM, t, , m, m') = max_A {Adv_CPA(A, SYM)}
where the maximum is taken over all adversaries A running in time at most t, asking queries which total at most bits, and whose outputs P₀ and P₁ have length at most m bits, and m' bounds the length of a SYM.encrypt-produced ciphertext whose plaintext is of length m.

A formal model of recipient-hiding and related properties

Let	PK be the set of public keys,
	SK be the set of private keys,
	PT be the set of possible plaintexts,
	CT be the set of possible ciphertexts,
	Coins be the set of possible random inputs.

A blinded-key asymmetric encryption scheme ASYM is modelled by a tuple (ASYM.encrypt, ASYM.decrypt, ASYM.keygen, ASYM.blind, ASYM.TIMES):

ASYM.encrypt : PK PT Coins CT
ASYM.decrypt : SK CT PT {INVALID}
ASYM.keygen : Coins SK PK
ASYM.blind : PK Coins PK
ASYM.TIMES : a set (normally a range) of non-negative integers

Iff n is in ASYM.TIMES, then it is valid for a public key generated by ASYM.keygen to be blinded (using ASYM.blind) n times, and the resulting public key used for encryption.

[For example, if a scheme EXAMPLE allows keys to be blinded any number of times, but an initial public key generated by EXAMPLE.keygen cannot be used for encryption, then EXAMPLE.TIMES will be the set of all integers >= 1.]

Correctness

A cipher is correct if it encrypts and decrypts information properly, i.e. (in the case of a public key cipher), a valid private key can be used to decrypt data encrypted under the corresponding valid public key. If the cipher ASYM supports key blinding, then this must hold when the public key has been blinded n times, for all n in ASYM.TIMES:

{ (sk, pk₀) = ASYM.keygen(coins) : coins Coins },
n ASYM.TIMES,
{ pk_i = ASYM.blind(pk_i-1, coins'_i) : i 1..n, coins'_i Coins },
P PT, coins'' Coins,
(ASYM.decrypt(sk, ASYM.encrypt(pk_n, P, coins'')) = P)

Message Privacy under Chosen-Plaintext Attack

The advantage of an adversary A in attacking ASYM under chosen plaintext attack is defined as

Adv_CPA(A, ASYM) = max { 2.Pr[ (sk, pk₀) _R ASYM.keygen;
pk_i _R ASYM.blind(pk_i-1), for i 1..n;
(P₀, P₁, state) := A.find(pk₀, ..., pk_n);
b _R {0, 1};
C _R ASYM.encrypt(pk_n, P_b);
A.guess(pk₀, ..., pk_n, C, state) = b] - 1
: n ASYM.TIMES }

Explanation

This definition is based on definition 6 from [DHAES] (but with slightly different notation).

The adversary's advantage is calculated based on the following experiment:

generate a key pair (sk, pk₀).
blind the public key pk₀ n times, to create a chain of n+1 keys, pk_0..n.
call the adversary's function "A.find(pk₀, ..., pk_n)", which chooses two plaintexts (P₀ and P₁) that it will try to distinguish between. The find function also returns a state for later use by the adversary.
choose one of the plaintexts, P_b, according to a fair coin toss (i.e. b = 0 or 1).
encrypt P_b using pk_n to give C.
call the adversary's function "A.guess(pk₀, ..., pk_n, state, y)", which tries to guess which plaintext was encrypted.

Unlike symmetric encryption, there is no need to give the adversary an encryption oracle, because arbitrary plaintexts can be encrypted using the public key (chosen ciphertext attacks will be considered later).

Note that the success probability is taken over the random choices in blinding, as well as the initial key generation and encryption. Adv_CPA(A, ASYM) is the maximum advantage of A against ASYM in an adaptive chosen plaintext attack, over all possibilities for n (the number of blinded keys in the chain), i.e. all values of n from the set ASYM.TIMES.

Note that it is important for the adversary to be given the whole chain of blinded public keys, rather than just the one that is being used for encryption. As a somewhat artificial, but concrete example to motivate this, consider modifying a proven-secure blinded-key encryption scheme as follows:

add to the public key of a generated key pair, a random DHAES public key (g, y) (for which the private key is not known).
add to ciphertexts produced by the scheme, the encryption of the plaintext using DHAES with public key (g, y).
when a key is blinded, set y := gy.

This is insecure if the adversary is given the "previous" public key in the chain (and therefore insecure in practice), but can be "proven secure" in a model where the adversary is only given the public key used for encryption - so such a model would obviously be inadequate.

As in [DHAES], the security of ASYM against adaptive chosen plaintext attack is defined by the function

InSec_CPA(ASYM, t, m) = max_A {Adv_CPA(A, ASYM)}

where the maximum is taken over all adversaries A running in time + code size at most t, and whose outputs P₀ and P₁ have length at most m bits.

Message Privacy under Non-Adaptive Chosen-Ciphertext Attack

The advantage of an adversary A in attacking ASYM under non-adaptive chosen ciphertext attack is defined as

Adv_CCA1(A, ASYM) = max { 2.Pr[ (sk, pk₀) _R ASYM.keygen;
pk_i _R ASYM.blind(pk_i-1), for i 1..n;
(P₀, P₁, state) := A.find(ASYM.decrypt_sk, pk₀, ..., pk_n);
b _R {0, 1};
C _R ASYM.encrypt(pk_n, P_b);
A.guess(pk₀, ..., pk_n, C, state) = b] - 1
: n ASYM.TIMES }

The security of ASYM against non-adaptive chosen ciphertext attack is defined as

InSec_CCA1(ASYM, t, q, , m) = max_A {Adv_CCA1(A, ASYM)}

where the maximum is taken over all adversaries A running in time + code size at most t, making at most q queries to its decryption oracle, these queries totaling at most

bits, and whose outputs P₀ and P₁ have length at most m bits.

The only difference from the chosen-plaintext-only case is that the adversary is given a decryption oracle (ASYM.decrypt_sk) in the find stage. This is based on definition 7 from [DHAES], adapted for blinded keys.

Message Privacy under Adaptive Chosen-Ciphertext Attack

The advantage of an adversary A in attacking ASYM under adaptive chosen ciphertext attack is defined as

Adv_CCA2(A, ASYM) = max { 2.Pr[ (sk, pk₀) _R ASYM.keygen;
pk_i _R ASYM.blind(pk_i-1), for i 1..n;
(P₀, P₁, state) := A.find(ASYM.decrypt_sk, pk₀, ..., pk_n);
b _R {0, 1};
C _R ASYM.encrypt(pk_n, P_b);
A.guess(ASYM.decrypt_sk, pk₀, ..., pk_n, C, state) = b] - 1
: n ASYM.TIMES }

The security of ASYM against adaptive chosen ciphertext attack is

InSec_CCA2(ASYM, t, q, , m) = max_A {Adv_CCA2(A, ASYM)}

where the maximum is taken over all adversaries A running in time + code size at most t, making at most q queries to its decryption oracle, these queries totaling at most

bits, and whose outputs P₀ and P₁ have length at most m bits.

Again, this is a straightforward adaptation from definition 8 of [DHAES].

Now we define new security properties for blinded-key encryption and recipient hiding (based on the same approach to modelling security):

Security of public key blinding

Informally, an adversary is able to break the security of the blinding function if it can determine with non-negligable probability whether two given keys are blinded versions of the same public key (or equivalently, that they correspond to the same private key).

Note that giving the hypothetical adversary the blinded public keys directly results in a stronger security property than giving the adversary only ciphertexts, because given the public keys, the attacks can generate its own ciphertexts corresponding to arbitrary chosen plaintexts.

However, we must be careful to account for all possibilities in the relationship between the public keys involved. In the figure below, X and Y are the public keys of two newly generated key pairs, A and B are keys obtained by blinding X and Y some number of times, and Z is a blinded version of either A or B. Loosely speaking, the adversary's advantage is based on the probability of it being able to guess which key Z was blinded from.

[Figure showing keys X, Y, A, B, Z as described above, with arrows between indicating blinding]
Figure 2: Example of relationships between blinded keys.

Only one possible case is shown in the figure; it is also possible for Z to be one of the original keys X or Y, if keys generated by ASYM.keygen can be used directly for encryption (i.e. when 0 is an element of ASYM.TIMES). In general, all valid combinations for the number of steps between X and Z, and between Y and Z must be considered; the overall advantage of the adversary is the maximum advantage for any of these combinations.

More precisely, for each combination, the adversary's advantage is calculated based on the following experiment:

generate two key pairs (skX, pkX₀) and (skY, pkY₀).
blind pkX₀ and pkY₀, nX and nY times respectively, to give two chains of keys pkX_0..nX and pkY_0..nY.
let pkZ be either pkX_nX or pkY_nY, according to a fair coin toss.
call the adversary's function "A.guess(pkZ, (pkX_0..nX-1), (pkY_0..nY-1))", which tries to guess which chain of keys pkZ belongs to.
(Note that nX or nY can be zero; in that case pkX_0..-1 and pkY_0..-1 are of zero length.)

Note that unlike the definitions for message privacy, there is no "find" stage. The reason why a "find" stage is needed for those definitions is that there could be a subset of messages for which a scheme is insecure, but which are a negligable proportion of all possible messages. This should count as a security weakness, and therefore it's necessary to allow the adversary to choose the messages that it will try to distinguish. In constrast, key pair generation and blinding do not require any non-random input, so there is no input that could be influenced by an adversary (assuming that the RNG is secure). "Weak keys" may be possible, but they are only a security weakness if they occur with non-negligable probability, and so it's sufficient to randomly choose the chains of keys that are to be distinguished in the experiment.

For this security property the adversary is not given a decryption oracle (i.e. cannot perform chosen ciphertext attacks). If a decryption oracle were allowed, then it would not be possible to prove the security of public key blinding for any scheme, because that would enable asking the owner of one of the private keys whether a message can be validly decrypted - which would immediately give away whether the public key used for encryption corresponds to that private key. I.e. this type of attack must be prevented at the protocol level, rather than by the encryption scheme. This should not be too much of a problem for protocols that are specifically designed to support anonymity.

[It is still useful for chosen ciphertext attacks to be considered for message privacy, since that protects the confidentiality of messages even when the anonymity of a recipient has been broken, or the mechanism for preventing chosen ciphertext attacks at the protocol level fails (i.e. it provides defense in depth).]

The advantage of an adversary A in attacking the public key blinding of ASYM is therefore defined as

Adv_Blind (A, ASYM) = max { 2.Pr[ (skX, pkX₀) _R ASYM.keygen;
(skY, pkY₀) _R ASYM.keygen;
pkX_i _R ASYM.blind(pkX_i-1), for i 1..nX;
pkY_i _R ASYM.blind(pkY_i-1), for i 1..nY;
b _R {0, 1};
pkZ = { pkX_nX, if b = 0
pkY_nY, if b = 1
A.guess(pkZ, (pkX₀, ..., pkX_nX-1), (pkY₀, ..., pkX_nY-1))
= b] - 1
: nX ASYM.TIMES, nY ASYM.TIMES }

The security of ASYM for public key blinding is

InSec_Blind(ASYM, t) = max_A {Adv_Blind(A, ASYM)}

where the maximum is taken over all adversaries A running in time + code size at most t.

Recipient hiding under Adpative Chosen Plaintext/Ciphertext Attack

A blinded-key encryption scheme is recipient-hiding, if given two blinded key chains, at the ends of which are two keys pkX and pkY, and oracles that encrypt chosen plaintexts or decrypt chosen ciphertexts with either pkX or pkY (always the same one), it is not possible to guess which key is being used for encryption, with probability significantly higher than 1/2.

The security definition for this property has two differences from that for key blinding:

the adversary is given oracles that encrypt and decrypt with the randomly selected key, rather than the key itself.
the blinded key chains include both of the end keys (pkX and pkY).

We use the following experiment:

generate two key pairs (skX, pkX₀) and (skY, pkY₀).
blind pkX₀ and pkY₀, nX and nY times respectively, to give two chains of keys pkX_0..nX and pkY_0..nY.
let (skZ, pkZ) be either (skX, pkX_nX) or (skY, pkY_nY), according to a fair coin toss.
call the adversary's function "A.guess(ASYM.encrypt_pkZ, ASYM.decrypt_skZ, (pkX_0..nX), (pkY_0..nY))", which tries to guess which chain of keys pkZ belongs to (by use of the oracles, rather than by knowing pkZ directly).

The advantage of an adversary A in attacking the recipient-hiding property of ASYM is defined as

Adv_Hide (A, ASYM) = max { 2.Pr[ (skX, pkX₀) _R ASYM.keygen;
(skY, pkY₀) _R ASYM.keygen;
pkX_i _R ASYM.blind(pkX_i-1), for i 1..nX;
pkY_i _R ASYM.blind(pkY_i-1), for i 1..nY;
b _R {0, 1};
(skZ, pkZ) = { (skX, pkX_nX), if b = 0
(skY, pkY_nY), if b = 1
A.guess(ASYM.encrypt_pkZ, ASYM.decrypt_skZ,
(pkX₀, ..., pkX_nX), (pkY₀, ..., pkY_nY)) = b] - 1
: nX ASYM.TIMES, nY ASYM.TIMES }

The insecurity level of ASYM for recipient hiding is

InSec_Hide(ASYM, t, q, ) = max_A {Adv_Hide(A, ASYM)}

where the maximum is taken over all adversaries A running in time + code size at most t, making at most q queries in total to its encryption and decryption oracles, these queries totaling at most

bits.

A recipient-hiding encryption scheme need not support key blinding, and the recipient hiding property can be useful on its own. Here are the corresponding definitions for an encryption scheme without key blinding, modelled as a tuple (ASYM.encrypt, ASYM.decrypt, ASYM.keygen) in the obvious way:

Adv_Hide' (A, ASYM) = 2.Pr [ (skX, pkX) _R ASYM.keygen;
(skY, pkY) _R ASYM.keygen;
b _R {0, 1};
(skZ, pkZ) = { (skX, pkX), if b = 0
(skY, pkY), if b = 1
A.guess(ASYM.encrypt_pkZ, ASYM.decrypt_skZ,
pkX, pkY) = b] - 1

InSec_Hide'(ASYM, t, q, ) = max_A {Adv_Hide'(A, ASYM)}

where the maximum is taken over all adversaries A running in time + code size at most t, making at most q queries in total to its encryption and decryption oracles, these queries totaling at most

bits.

Blinded-key indistinguishability under Chosen-Plaintext Attack

Two blinded keys corresponding to the same private key are indistinguishable, if even the private key holder cannot tell which key was used to produce a collection of ciphertexts.

[[need explanation of relationship between blinded keys]]

[Figure showing keys A, B, X, Y as described above, with arrows between indicating blinding]
Figure 3: Example of relationships between blinded keys.

The advantage of an adversary A in attacking the blinded-key indistinguishability of ASYM, under adaptive chosen plaintext attack is

Adv_Indist (A, ASYM) = max { 2.Pr[ (skZ, pkX₀) _R ASYM.keygen;
pkX_i _R ASYM.blind(pkX_i-1), for i 1..nX;
pkY_i = pkX_i, for i 0..n;
pkY_i _R ASYM.blind(pkY_i-1), for i n+1..nY;
b _R {0, 1};
pkZ = { pkX_nX, if b = 0
pkY_nY, if b = 1
A.guess(ASYM.encrypt_pkZ, skZ,
(pkX₀, ..., pkX_nX-1), (pkY₀, ..., pkX_nY-1) = b] - 1
: nX ASYM.TIMES, nY ASYM.TIMES, n 0..min(nX, nY-1) }

The insecurity level of ASYM for blinded-key indistinguishability under adaptive chosen plaintext attack is

InSec_Indist(ASYM, t, q, ) = max_A {Adv_Indist(A, ASYM)}

where the maximum is taken over all adversaries A running in time + code size at most t, making at most q queries to its encryption oracle, these queries totaling at most

bits.

Proposed schemes

We present two families of schemes, based on Diffie-Hellman and RSA. Only the Diffie-Hellman-based family supports key blinding.

Diffie-Hellman family

The schemes in this section are similar to DHAES ("Diffie-Hellman Authenticated Encryption Scheme"), as described in [DHAES]. They are proven equally as secure as DHAES for message privacy under chosen plaintext, chosen ciphertext and adaptive chosen ciphertext attack (and hence also achieve non-malleability), under the same assumptions.

...

RH-DHAES

This scheme is simply DHAES with the GROUP parameters common to all users (the symmetric cipher SYM, Message Authentication Code MAC, and hash function H must also be common). ...

Sketch of security proofs for RH-DHAES

Proofs of Message Privacy against Chosen Plaintext, Chosen Ciphertext and Adaptive Chosen Ciphertext attacks are identical to those given in [DHAES], because this scheme is simply a special case of DHAES.

Recipient-hiding can be proven as follows: the form of a ciphertext is (gk, tag, ct) where k is chosen at random from [1, |G|]. When common group and g parameters are used, gk therefore has the same distribution for all public keys. [[unfinished]]

A complete description of RH-DHAES and formal security proofs are given here.

BRH-DHAES

SRH-DHAES

Lemma 1:

Let u (0, 1) be a security parameter.
Let R₀ and R₁ be uniform independent random variables over finite sets S₀ and S₁ respectively, where S₀ S₁ and |S₀| > (1 - u) |S₁|.
The advantage of a computationally unbounded adversary A in distinguishing R₀ from R₁ given N > 0 samples is defined as follows:

Adv(A, S₀, S₁, N) = 2.Pr[ b _R {0, 1};
r₁, ..., r_N _R R_b;
A.guess(r₁, ..., r_N) = b] - 1

(A may depend on S₀ and S₁).
Then Adv(A, S₀, S₁, N) < N.u for any A.

Proof: here

It is more difficult to find an efficient steganographic encoding of some groups than others. For example, it seems to be difficult to find an encoding of elliptic curve points that cannot be distinguished from random (the point compression technique described in [HPCompression] comes close, but still retains approximately one bit of redundancy, because only points from a prime-order subgroup can be encoded).

Define a P-compact encoding as an encoding of group elements into t bits, such that P = |G| / 2^t is the probability that a random t-bit string will correspond to an element of G. Also suppose that a probability u is considered negligable. Then one possibility for a steganographic encoding of a group element y is given by the following algorithm:

i := 0;
while (i < log(u) / log(P)) {
    with probability P, break;
    output t bits chosen uniformly at random from the set of t-bit strings that are not encodings of group elements;
    i := i + 1;
}
output encoding of y;
output t.(log(u)/log(P) - i) random bits;

The probability of getting to iteration i < log(u)/log(P) without exiting the loop, is Pⁱ, therefore when i = log(u) / log(P), this probability is P^{log(u) / log(P)} = u. It is safe to exit the loop unconditionally at that point (i.e. to put a definite limit on the maximum length of the encoding), because u is a negligable probability.

Unfortunately, although this encoding can be proven to be indistinguishable from random, it is rather inefficient unless P is close to 1. For the case of points on an elliptic curve, using the P-compact encoding described earlier with P = 1/2, the encoded length must be expanded by about 40 times to achieve a reasonable level of security.

In some cases it may be possible to omit the final padding of t.(log(u)/log(P) - i) random bits, and rely on the fact that the following parts of the ciphertext (i.e. the MAC result and symmetric cipher output) are also indistinguishable from random. Note that the expected length of the encoded group element excluding this final padding is only t / P bits (e.g. 2t bits for elliptic curves). This approach should only be used when the full ciphertext visible to an attacker is padded to a fixed total length (as is commonly done in mix-net protocols, for example), or it is embedded into another file type in such a way that the length of the full ciphertext cannot be determined. Some care is needed to ensure that bias is not introduced by a user's response to cases where the plaintext message is too long to fit in the available number of bits. For example, if a message is close enough to the length limit that whether it "fits" within this limit depends on the random choices made in the encoding process, and as a result the user repeats the process until the message fits, a bias will be introduced (the number of t-bit blocks before the first block that encodes a valid group element will be less than expected for random data).

[[unfinished]]

Consider a protocol that has two types of messages, one public key encrypted using RH-DHAES over an elliptic curve, and another symmetrically encrypted, where it is essential that an attacker not be able to distinguish the two types. This can be achieved by adding a dummy header to the symmetrically encrypted messages, which follows the same distribution as the elliptic curve point gk in RH-DHAES. A practical disadvantage of this approach is that if the protocol ever needs to be changed not to use elliptic curves, or to use different curve parameters, then new messages will be distinguishable from old ones.

BSRH-DHAES

RSA family

SRH-RSA-OAEP. As implied by the name, this is a steganographically recipient-hiding algorithm, and makes use of a slightly modified version of OAEP ("Optimal Asymmetric Encryption Padding"). The original version of OAEP is described in [OAEP] and [PKCS1v2]. There is no non-steganographic version of this algorithm, but the steganographic property should not cause any problems even if it is not needed.

SRH-RSA-OAEP

[[unfinished]]

References

[DHAES]: Michel Abdalla, Mihir Bellare, Phillip Rogaway,
"DHAES: An Encryption Scheme Based on the Diffie-Hellman Problem,"
Contribution to IEEE P1363a.
http://grouper.ieee.org/groups/1363/contributions/dhaes.pdf (temporary URL), or
Theory of Cryptography Library.
http://philby.ucsd.edu/cryptolib/1999/99-07.html
[Concrete]: M. Bellare, A. Desai, E. Jokipii and P. Rogaway,
"A concrete security treatment of symmetric encryption: Analysis of the DES modes of operation."
Preliminary version in Proceedings of the 38th Symposium on Foundations of Computer Science, IEEE, 1997.
Current version at http://www-cse.ucsd.edu/users/mihir/papers/sym-enc.html
[FindThenGuess]: ...
[RSA20]: Dan Boneh,
"Twenty Years of Attacks on the RSA Cryptosystem,"
Notices of the American Mathematical Society (AMS), Vol. 46, No. 2, pp. 203-213, 1999.
http://crypto.stanford.edu/~dabo/abstracts/RSAattack-survey.html
[Mixnet]: ...
[Babel]: ...
[Nymserver]: ...

David Hopwood
<hopwood@zetnet.co.uk>

Scheme	Key blinding?	Steganographic?	Common parameters?	Based on
RH-DHAES	No	No	Yes	DHP
SRH-DHAES	No	Yes	Optional	DHP
BRH-DHAES	Yes	No	Yes	Decisional DHP
BSRH-DHAES	Yes	Yes	Yes	Decisional DHP
SRH-RSA-OAEP	No	Yes	No	RSAP

Adv_CPA(A, SYM) = 2.Pr[	K _R KEY;
	(P₀, P₁, state) := A.find(SYM.encrypt_K);
	b _R {0, 1};
	C _R SYM.encrypt(K, P_b);
	A.guess(SYM.encrypt_K, C, state) = b] - 1

Adv_CPA(A, ASYM) = max {	2.Pr[	(sk, pk₀) _R ASYM.keygen;
		pk_i _R ASYM.blind(pk_i-1), for i 1..n;
		(P₀, P₁, state) := A.find(pk₀, ..., pk_n);
		b _R {0, 1};
		C _R ASYM.encrypt(pk_n, P_b);
		A.guess(pk₀, ..., pk_n, C, state) = b] - 1
	: n ASYM.TIMES }

Adv_CCA1(A, ASYM) = max {	2.Pr[	(sk, pk₀) _R ASYM.keygen;
		pk_i _R ASYM.blind(pk_i-1), for i 1..n;
		(P₀, P₁, state) := A.find(ASYM.decrypt_sk, pk₀, ..., pk_n);
		b _R {0, 1};
		C _R ASYM.encrypt(pk_n, P_b);
		A.guess(pk₀, ..., pk_n, C, state) = b] - 1
	: n ASYM.TIMES }

Adv_CCA2(A, ASYM) = max {	2.Pr[	(sk, pk₀) _R ASYM.keygen;
		pk_i _R ASYM.blind(pk_i-1), for i 1..n;
		(P₀, P₁, state) := A.find(ASYM.decrypt_sk, pk₀, ..., pk_n);
		b _R {0, 1};
		C _R ASYM.encrypt(pk_n, P_b);
		A.guess(ASYM.decrypt_sk, pk₀, ..., pk_n, C, state) = b] - 1
	: n ASYM.TIMES }

Adv_Blind	(A, ASYM) = max {	2.Pr[	(skX, pkX₀) _R ASYM.keygen;
			(skY, pkY₀) _R ASYM.keygen;
			pkX_i _R ASYM.blind(pkX_i-1), for i 1..nX;
			pkY_i _R ASYM.blind(pkY_i-1), for i 1..nY;
			b _R {0, 1};
			pkZ =	{	pkX_nX, if b = 0
					pkY_nY, if b = 1
			A.guess(pkZ, (pkX₀, ..., pkX_nX-1), (pkY₀, ..., pkX_nY-1))
			= b] - 1
		: nX ASYM.TIMES, nY ASYM.TIMES }

Adv_Hide	(A, ASYM) = max {	2.Pr[	(skX, pkX₀) _R ASYM.keygen;
			(skY, pkY₀) _R ASYM.keygen;
			pkX_i _R ASYM.blind(pkX_i-1), for i 1..nX;
			pkY_i _R ASYM.blind(pkY_i-1), for i 1..nY;
			b _R {0, 1};
			(skZ, pkZ) =	{	(skX, pkX_nX), if b = 0
					(skY, pkY_nY), if b = 1
			A.guess(ASYM.encrypt_pkZ, ASYM.decrypt_skZ,
			(pkX₀, ..., pkX_nX), (pkY₀, ..., pkY_nY)) = b] - 1
		: nX ASYM.TIMES, nY ASYM.TIMES }

Adv_Hide'	(A, ASYM) = 2.Pr	[	(skX, pkX) _R ASYM.keygen;
			(skY, pkY) _R ASYM.keygen;
			b _R {0, 1};
			(skZ, pkZ) =	{	(skX, pkX), if b = 0
					(skY, pkY), if b = 1
			A.guess(ASYM.encrypt_pkZ, ASYM.decrypt_skZ,
			pkX, pkY) = b] - 1

Adv_Indist	(A, ASYM) = max {	2.Pr[	(skZ, pkX₀) _R ASYM.keygen;
			pkX_i _R ASYM.blind(pkX_i-1), for i 1..nX;
			pkY_i = pkX_i, for i 0..n;
			pkY_i _R ASYM.blind(pkY_i-1), for i n+1..nY;
			b _R {0, 1};
			pkZ =	{	pkX_nX, if b = 0
					pkY_nY, if b = 1
			A.guess(ASYM.encrypt_pkZ, skZ,
			(pkX₀, ..., pkX_nX-1), (pkY₀, ..., pkX_nY-1) = b] - 1
		: nX ASYM.TIMES, nY ASYM.TIMES, n 0..min(nX, nY-1) }

Adv(A, S₀, S₁, N) = 2.Pr[	b _R {0, 1};
	r₁, ..., r_N _R R_b;
	A.guess(r₁, ..., r_N) = b] - 1