Sep 23, 2009

pefs crypto primitives (updated)

Supported data encryption algorithms: AES and Camellia (with 128-, 192- and 256-bit key sizes). Adding another block cipher with a 128-bit block size should be trivial.

File names are always encrypted using AES-128 in CBC mode with a zero IV. An encrypted file name consists of a unique per-file tweak, a checksum and the name itself:
XBase64(checksum || E(tweak || filename))

The checksum is a VMAC of the encrypted tweak and file name:
checksum = VMAC(E(tweak || filename))

Both the checksum and the tweak are 64 bits long.

The main reason for not providing alternatives to the name encryption algorithm is to keep the design simple. Data encryption is different from name encryption here: encrypted data, unlike an encrypted file name, is not parsed in any way by pefs, and users expect to be able to choose whichever cipher they consider the most secure, the fastest or the best known.

The name has this structure to work around some of the shortcomings of CBC. A random tweak value is placed at the beginning of the first encrypted block. That gives us unique encrypted file names and eliminates the need to deal with an initial IV (the IV is zero and the name is padded with zeros).
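The layout described above can be sketched in a few lines. This is only an illustration of the plaintext layout and of the resulting name length, not pefs code: the helper names are made up, and the length estimate assumes that XBase64 emits one character per 6 bits without padding characters, which the post does not state.

```python
import os

BLOCK = 16       # AES block size in bytes
TWEAK_LEN = 8    # 64-bit tweak
CSUM_LEN = 8     # 64-bit checksum

def name_plaintext(filename: bytes, tweak: bytes) -> bytes:
    """Build tweak || filename, zero-padded to a whole number of AES blocks."""
    if len(tweak) != TWEAK_LEN:
        raise ValueError("tweak must be 64 bits long")
    buf = tweak + filename
    return buf + b"\x00" * (-len(buf) % BLOCK)

def encoded_name_len(filename_len: int, csum_len: int = CSUM_LEN) -> int:
    """Characters in the encoded name, assuming 6 bits per character (assumption)."""
    padded = -(-(TWEAK_LEN + filename_len) // BLOCK) * BLOCK
    raw_bits = (csum_len + padded) * 8
    return -(-raw_bits // 6)   # ceiling division

pt = name_plaintext(b"notes.txt", os.urandom(TWEAK_LEN))
print(len(pt))                        # bytes fed to AES-CBC
print(encoded_name_len(9))            # with a 64-bit checksum
print(encoded_name_len(9, 32))        # with a hypothetical 256-bit checksum
```

Comparing the last two numbers gives a feel for how much a longer checksum would inflate every file name.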

An Encrypt-then-Authenticate construction is used. In addition to being the most secure variant, it allows checking whether a name was encrypted with a given key without performing decryption. VMAC was chosen because of its performance characteristics and its ability to produce a 64-bit MAC directly (without truncating a longer result, as would be needed with HMAC). A 64-bit size is almost mandatory here: a larger MAC would make file names much longer and would hardly improve security. But the real reason is that no real "authentication" is performed. The MAC is designed to be just a cryptographic checksum (that sounds incorrect, but I can't find better wording), so breaking VMAC wouldn't result in breaking encrypted data; besides, the name checksum doesn't authenticate the encrypted data. The checksum's main purpose is to make it possible to find the key a file is encrypted with.
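A minimal sketch of that key lookup, under loud assumptions: VMAC is not in the Python standard library, so a truncated HMAC-SHA256 stands in for it here (exactly the truncation that pefs avoids by using VMAC), and the function names and key ids are hypothetical.

```python
import hashlib
import hmac

CSUM_LEN = 8  # 64-bit checksum, as in the name format

def checksum(mac_key: bytes, enc_name: bytes) -> bytes:
    # Stand-in MAC (assumption): truncated HMAC-SHA256 instead of VMAC.
    return hmac.new(mac_key, enc_name, hashlib.sha256).digest()[:CSUM_LEN]

def find_key(raw_name: bytes, mac_keys: dict):
    """Return the id of the key whose checksum matches the name, or None.

    raw_name is the decoded name: checksum || E(tweak || filename).
    Nothing is decrypted; only the MAC is recomputed per candidate key.
    """
    stored, enc_name = raw_name[:CSUM_LEN], raw_name[CSUM_LEN:]
    for key_id, mac_key in mac_keys.items():
        if hmac.compare_digest(stored, checksum(mac_key, enc_name)):
            return key_id
    return None

keys = {"work": b"k" * 32, "home": b"h" * 32}
enc = bytes(range(32))                      # pretend ciphertext
raw = checksum(keys["home"], enc) + enc
print(find_key(raw, keys))
```

Because the checksum covers the ciphertext, a directory containing names encrypted with several keys can be sorted out key by key before any decryption takes place.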

An encrypted directory/socket/device name also contains a tweak, but it is used solely to randomize the first CBC block and to keep the name structure uniform.

The idea behind the tweak is to get unique per-file ciphertext. Block ciphers (AES, Camellia) operate in XTS mode. The 64-bit tweak value concatenated with the 64-bit file offset forms the tweak used by XTS. All encryption operations are performed on 4096-byte sectors ("data units" in XTS terminology). Incomplete sectors are also encrypted according to the XTS standard. Encryption of sectors smaller than 128 bits is not defined for XTS, however; in that situation CTR mode is used, with the tweak value generated according to XTS. If a full 4096-byte sector is all zeros before decryption, it is not decrypted and is treated as a hole in a sparse file.
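The per-sector rules above can be sketched as follows. This only classifies sectors and builds the 128-bit tweak; it does no encryption. The function names are hypothetical, and the little-endian packing of the two 64-bit halves is an assumption, since the post does not specify byte order.

```python
import struct

SECTOR = 4096   # bytes per sector
XTS_MIN = 16    # one 128-bit cipher block

def xts_tweak(file_tweak: int, offset: int) -> bytes:
    """64-bit file tweak || 64-bit file offset (byte order is an assumption)."""
    return struct.pack("<QQ", file_tweak, offset)

def plan_read(data: bytes, base_offset: int = 0):
    """Yield (offset, length, action) for each sector of ciphertext in data."""
    for pos in range(0, len(data), SECTOR):
        chunk = data[pos:pos + SECTOR]
        if len(chunk) == SECTOR and not any(chunk):
            action = "hole"   # full all-zero sector: left undecrypted
        elif len(chunk) >= XTS_MIN:
            action = "xts"    # full or incomplete sector
        else:
            action = "ctr"    # shorter than one cipher block
        yield base_offset + pos, len(chunk), action

data = b"\x00" * SECTOR + b"\x01" * SECTOR + b"\x02" * 5
for offset, length, action in plan_read(data):
    print(offset, length, action, xts_tweak(0x1122334455667788, offset).hex())
```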

Four different keys are used for cryptographic operations: one for name encryption, one for VMAC and two for data encryption, as required by XTS. These keys are derived from a 512-bit user-supplied key using the HKDF algorithm based on HMAC-SHA512 (IETF draft). The kernel part expects a cryptographically strong key from userspace. This key is generated from the passphrase with PBKDF using HMAC-SHA512.
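A rough sketch of that key hierarchy using only the Python standard library. HKDF is written out in the extract-and-expand form later published as RFC 5869. The info labels, the 32-byte output length, the empty HKDF salt, the iteration count and the use of PBKDF2 specifically are all assumptions made for illustration; none of them come from pefs.

```python
import hashlib
import hmac

HASH = hashlib.sha512
HASH_LEN = 64

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # An empty salt is replaced by HASH_LEN zero bytes.
    return hmac.new(salt or b"\x00" * HASH_LEN, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]

def user_key_from_passphrase(passphrase: str, salt: bytes,
                             iterations: int = 50000) -> bytes:
    # Assumption: PBKDF2; produces the 512-bit key handed to the kernel.
    return hashlib.pbkdf2_hmac("sha512", passphrase.encode(), salt,
                               iterations, dklen=64)

def derive_keys(user_key: bytes, key_len: int = 32) -> dict:
    """Derive the four working keys; labels are hypothetical."""
    prk = hkdf_extract(b"", user_key)
    labels = ("name", "vmac", "data", "data-tweak")
    return {label: hkdf_expand(prk, label.encode(), key_len) for label in labels}

user_key = user_key_from_passphrase("correct horse", b"example-salt", 1000)
for label, key in derive_keys(user_key).items():
    print(label, key.hex())
```

Deriving each key with a distinct label keeps the four keys independent of each other even though they all come from one user-supplied secret.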

Standard implementations of the ciphers are used, but I do not use the opencrypto framework, so there is no hardware acceleration available. opencrypto is not used mainly because it lacks full support for XTS mode (the OpenBSD version is not able to encrypt incomplete sectors). opencrypto is also rather heavyweight (extra initialization and memory allocations), so using it may even worsen performance (hardware initialization costs when encrypting short chunks with different keys).

Besides that, pefs supports multiple keys, mixing files encrypted with different keys in a single directory, a transparent (unencrypted) mode, key chaining (adding a series of keys by entering just one of them) and more. I'm going to write about it soon.

Sep 16, 2009

pefs benchmark

pefs is a stacked cryptographic filesystem for FreeBSD. It started as a Google Summer of Code 2009 project.

I've just come across a performance comparison of eCryptfs against a plain ext4 filesystem on Ubuntu, a benchmark I was going to perform on my own.

I run dbench benchmarks regularly while working on pefs, but I use them mostly as a stress test. I haven't yet reached the point where I can start working on improving performance, but measuring pefs overhead is going to be interesting.

Unfortunately, I fail to interpret the dbench results from the article. They used dbench 4, while I'm using dbench 3 from ports. Nevertheless, a result of 4-8 Mb/s looks too strange to me.

I've benchmarked 4 and 16 dbench clients on zfs, on pefs with salsa20 encryption (256-bit key) on top of the same zfs partition, and on pefs with aes encryption (128-bit key, ctr mode). I ran the benchmark three times in each setup.

First of all, cipher throughput:
salsa20 ~205.5 Mb/s
aes128 ~81.3 Mb/s

Benchmark results:

In both cases (4 and 16 clients) the CPU was the limiting factor; the disks were mostly idle. This explains the divergence in the zfs results: what I actually benchmarked was zfs arc cache performance. Because of the unpredictable inner workings of zfs, the best aes128 result can come surprisingly close to the worst salsa20 one (salsa20 is ~2.5 times faster than aes128).

The graph comparing average values:

The conclusion is that pefs is about 2x slower. But that shouldn't be solely because of encryption. From my previous testing I can conclude that it's mostly filesystem overhead:

  • The current pefs implementation avoids data caching (to prevent double caching and restrain one's paranoia). I had a version using buffer management for io (bread/bwrite); its performance was awful, something like 20-30 Mb/s with salsa20 encryption.

  • Sparse files (and file resizing too) are implemented very poorly: extending a file requires an exclusive lock and fills the gap with zeros, even though this "gap" is likely to be filled by the application very soon.

  • The lookup operation is very expensive. It calls readdir and decrypts the name of each directory entry.

The eCryptfs IOzone benchmark also shows a 2x difference.