Friday, 6 May 2016

Better tamper-proof ledgers

In the previous post, we pointed out how Certificate Transparency doesn't maintain cryptographic future or forwards integrity, and why that matters mainly if you try to reuse the log protocol for non-certificate things. Most other ledger schemes aren't much better. Blockchain - and especially private blockchains - can be tampered with by attackers with large network CPU, and Schneier-Kelsey logs and their derivatives allow verifiers to tamper with the logs.

It may very well be the case that there are public ledgers out there that have all of the properties I want, but if so, I can't find them and they're certainly not in widespread use. Indeed, most ledgers rely on access-control, probably because the few ledger designs that do exist have major usability or cryptographic problems that make them difficult for developers to actually use.

So let's fix that.

Future integrity is the easy bit. There's lots of work on this subject, and the crux of the solution is having roll-forward keys. For example, if K is our initial key, then Hash(K) is our second key and so on. This means if a hacker steals our key at time Tn, they can only tamper with older keys by performing inverse hashes - a problem that is cryptographically hard. This stops a hacker who hacks our logger from tampering with old log entries. This is really important in the case of, say, event logs.

Forwards integrity is a bit harder, and there's some work on this subject too. The crux to forwards integrity is having new keys have new entropy, but still be cryptographically linked to the previous key. This stops a hacker who hacks our logger from tampering with entries after they have been evicted (or left) the logger. This is particularly important in the case of, say, binary or update transparency and integrity.

SSL's perfect forward secrecy ciphers achieve forwards-integrity (for secrecy, rather than integrity reasons) by having a master SSL private key that signs individual ephemeral encryption keys, rather than encrypting with the master key directly. The key-exchange with the user then happens using the ephemeral key, not the signing key.

This deliberately limits the damage if a hacker steals your SSL private key. With knowledge of your private key, the hacker can obviously impersonate your site; they can even actively man-in-the-middle legitimate connections to your site. But they can't decrypt connections to your site unless they're still in control of your server. The private key they have allows them to forge ephemeral keys; it does not allow them to decrypt key-exchanges of ephemeral keys to which the hacker does not have the corresponding private key.

It's worth pointing out that the forward secrecy here is derived from the new private key introducing new entropy; not that the key is ephemeral. If the private keys of the ephemeral keys were derived from, say, a never-reseeded PRNG, the hacker can predict the ephemeral private keys if she steals the internal state of the PRNG and then leaves the server. This is why, for example, Schneier-Kelsey ledgers have no forwards integrity. That scheme has ephemeral keys, but because future keys are a fixed function of previous secrets, so a hacker who hacks the logger once can still predict future ephemeral keys.

Armed with these two facts, we can make a really simple ledger scheme that has both:

Ledger initialization
1. Generate a public/private key pair P0, P0'
2. Distribute P0 widely to verifiers out of band.

Adding the i-th entry to the ledger
1. Generate a public/private key pair P{i+1}, P{i+1}'
2. Sign the data Ei and the public key P{i+1} with private key Pi'.
3. Permanently erase Pi'

Public verification is now really simple. The first public key is known a priori. The first entry tells us the first entry's data and the public key that will be used to sign the second entry. We can then follow the chain of entries, each one signed by a key described by the previous entry.

Depending on how you look at it, the solution is really just a blend of different ideas already in use in other parts of cryptography, but now applied to ledgers. It uses asymmetric key-rolling like Axolotl, but for integrity, not secrecy; it uses asymmetric cryptography rather than symmetric cryptography to allow verifiers to verify without giving verifiers the ability to tamper so we can achieve zero-trust, and it uses block-chain like verification, but swapping mining for signing because most of the time we prefer fast centralized ledgers than slow decentralized ledgers that are inherently vulnerable to high network CPU attackers.

The scheme has some nice properties. It's centralized so only the logger gets to append entries. This is great for event logs, binary transparency, plane black-boxes and even most financial use-cases. It's publicly readable. Anyone can read or verify the entries without undermining the integrity of the system. Better still: it's zero trust. Since even the first private key is erased, the ledger cannot be tampered with by anyone: not the readers, not the verifiers - not even the logger itself.

We're also algorithm agnostic. Our only criteria for our algorithm is that generating new private keys requires new entropy if you want it to have forwards integrity, and that the algorithm we choose is a asymmetric signature algorithm. ECDSA or RSA would work just fine. But curiously we only ever need to sign data once. And for crypto-nerds that brings to mind a really cool asymmetric algorithm that nobody in their right mind ever uses: Lamport signatures.

Lamport signatures are old. They were invented in 1979, and, like one-time-pads, they have great security properties until you decide to use them more than once, at which point their security quickly collapses. But we only ever sign data exactly once, so how do they work?

A Lamport signature is coupled with two cryptographic hash functions: a data-hash and a key-hash, although normally both hashes are the same.

Generate Lamport keys
1. For i = 0 .. 2 x bitlen(key-hash)
  privatekey[i] = cryptorandom()
  publickey[i]  = keyhash(privatekey[i])

Generate Lamport signatures
1. Generate h=datahash(data)
2. For each bit b in h
   If b is 0: Signature[i] = privatekey[2*i]
   If b is 1: Signature[i] = privatekey[2*i+1]

Validating Lamport signatures
1. Generate h=datahash(data)
2. For each bit b in h
   If b is 0: Verify that keyhash(Signature[i]) = publickey[2*i]
   If b is 1: Verify that keyhash(Signature[i]) = publickey[2*i+1]

The way Lamport signatures work, is essentially thus. Suppose we want to use a 128-bit data hash algorithm. If so, we generate 256 "buckets" of random data (our private key), and publish the hash of each bucket (our public key). When we later want to sign some data, we hash that data to get a 128-bit hash. We use that hash to build a configuration of the 256 buckets that is unique to that hash value, and publish the original bits corresponding to those buckets.

A verifier can verify the integrity of the hash by hashing each element of our signature, and mapping them to the public key, which reveals the configuration. If that configuration exactly matches the hash of the data, the signature is valid. Otherwise it is not.

If a hacker modifies any bit of the data, the hash of the data will change and the configuration will not match the published signature. If the attacker wishes to forge the signature as well, she needs also to know the random data that led to the unpublished bucket corresponding to the bits that are different in the hash. To do so, she must either invert the key-hash, or brute-forcing the corresponding bucket of random bits. These are crypto-hard problems, so the scheme is secure.

In cryptography, asymmetric algorithms are based on functions that are easy to do forwards, and hard to do backwards. For example RSA is fundamentally based on the observation that factoring numbers is much harder than multiplying numbers. Lamport signatures are based on the observation that cryptographic hashes are easy to perform, but hard to invert.

So why does nobody use Lamport signatures in real life? Lamport signatures have big disadvantages compared with normal signatures. Most obviously, their security collapses if they are used more than once, but also their public keys and signatures are large.

Unfortunately, quantum-resistant asymmetric algorithms are rare. Bluntly, cryptographers have really screwed up here: the lack of standardized, peer-reviewed and easy-to-use quantum-resistant asymmetric algorithms is putting the entire world's security in jeopardy - but I digress. Here we have a nice example of an asymmetric algorithm that isn't based on the hardness of integer-factorization or discrete-logs, and which also isn't based on bleeding edge untested research. We can rely entirely on well-established principles of the security of cryptographic hashes, to quickly achieve quantum-computing resistance to the ledger, if you think your ledger's security against tampering is important enough to suck up the extra size costs of the public key and signature compared to ECDSA.

Of course, you can just use the scheme with RSA or ECDSA with the scheme provided above. Either way, it'll give you better tamper-proof characteristics than most cryptography or products on the market, including private-blockchains, ACL-based log schemes, or even Certificate Transparency.