GitHub has recently disabled the Downloads feature the sysutils/pefs-kmod port was relying on. GitHub downloads were frustrating, but less hassle than hosting PEFS elsewhere and having to deal with two bug trackers, repositories, etc.
Downloads have moved to code.google.com, but I intend to continue using GitHub for bug tracking and as the main source repository.
PEFS - code.google.com/p/pefs
ggateu - code.google.com/p/ggateu
Gleb Kurtsou
Dec 28, 2012
Aug 21, 2012
A Sunday well spent
Two months ago I decided to encrypt some personal files, which wasn't a big deal thanks to the stacked crypto file system we have in FreeBSD :) This weekend I realized that the worst thing that could possibly happen to encrypted data did actually happen -- I couldn't recall the password.
The decision was made to roll back ZFS transactions to recover a deleted snapshot. Performing such a dangerous operation on my personal laptop wasn't an option, especially considering there was no spare disk lying around for a backup. So I started hacking on ggate to create a union provider.
The key idea behind ggateu was not to use metadata storage, but to rely on the fact that the probability of a disk sector encrypting to all zeros is negligible. That's the same trick PEFS uses to handle sparse files.
The utility is available for download here: https://github.com/glk/ggateu
It worked for me, it's likely to be buggy, and it's great for file system experiments!
P.S. Right after finishing the initial version of ggateu it became clear that the ZFS uberblock for that particular transaction was long gone. After spending another half an hour typing semi-random passwords I recovered my files. I still have no idea what the password was.
Apr 10, 2012
Ancient code
The graphics mode entry in the recently released links 2.6 changelog[1] sounded vaguely familiar. So I started googling for my first non-trivial contribution to an open source project. It was in 2003 - nearly 9 years ago[2]. I wish I had that patch to look at :D
[2] I can't recall much of it; mozilla was too buggy and elinks unusable. I used FreeBSD :)
Apr 14, 2011
Secure backups for a lazy developer
A developer is always afraid of losing source code. As a rule, after a crash you'll be able to restore all but the last several revisions, or you'll get the sources back but with a damaged repository. It doesn't happen often, but it's better to feel safe.
Backing up a central repository on a server and backing up personal projects are two very different stories. Developers are too lazy to use server-grade backup methods. It's quite common to create an archive of all projects, encrypt it and store it somewhere. Writing the archive to a CD-R or other media is time consuming, and these CD-Rs tend to get lost the day after you burn them. Another option is to use an online backup service, especially considering that 1-2 GB is often available for free. The process can be automated, but it has several limitations: the entire archive has to be downloaded/uploaded, restoring data is cumbersome, and managing encryption keys/passwords is complicated.
My list of desired features for a backup system:
- Store data encrypted
- Download/upload only bits that changed
- Automated synchronization
To achieve these goals I use a stacked cryptographic file system and a free online storage service: PEFS and Ubuntu One. PEFS is a FreeBSD kernel-level stacked file system, and Ubuntu One seems to be the only service with an open source client supporting synchronization. Linux users could use ecryptfs/encfs with Dropbox or a similar setup.
I back up git repositories, but it should work for regular data as well. git was chosen because it stores revision deltas in separate packs rather than in individual files, i.e. a new backup creates a new file and leaves existing objects intact. Another reason is to prevent inconsistent syncs when using several clients. How a merge (if any) is performed by the service provider remains a mystery, but it can easily be controlled by the version control system.
Layout:
/backup/.encrypted - encrypted data synced with online storage
/backup/local - decrypted local representation
Create backup file system:
# # (optional) zfs create tank/backup
# mkdir -p /backup/{.encrypted,local}
# pefs addchain -fZ /backup/.encrypted
Enter password
Mount encrypted filesystem and add key:
# pefs mount /backup/.encrypted /backup/local
# pefs addkey -c /backup/local
Enter password
Backup a project:
# git init --bare /backup/local/project1.git
# cd project1_dir
# git remote add backup /backup/local/project1.git
# git push --mirror backup
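Restoring a project later is just a clone (or fetch) from the backup path; for example (the target directory name is arbitrary):
# git clone /backup/local/project1.git project1.restored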
Here is what encrypted data looks like:
# ls -A /backup/.encrypted
.D3ForUU+Xh8DEL3b1oRGYfD57VKQqLahzYZnHRjINSDT3hqJMRAPqA
.Gz8xQqNAzQFqQ4CiOZPGSlEIbf+tVvZHXG1SisReRxfwqpKJK0VYvA
.O96wecIt1g4YnhFTTp3KTW2mWFk33vQBt4ZBvX9ZbMPP5HCd0INbgg
.pefs.db
The u1sync command line tool is used to sync the data. You'd need to register with Ubuntu One and extract the OAuth authentication tokens; a step-by-step guide is here.
Initialize shared directory:
# u1sync --oauth FOO:BAR --init /backup/.encrypted
Sync it:
# u1sync --oauth FOO:BAR /backup/.encrypted
There is no u1sync port for FreeBSD yet; you'd have to install all the dependencies from ports and get it running by hand. To speed things up I've also extended u1sync to store OAuth tokens and created a /backup/Makefile with mount/unmount/sync targets.
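The Makefile itself isn't shown here; a minimal sketch of what it could look like, reusing the commands above (the variable and target layout is my assumption, and with the token-storing u1sync extension the --oauth argument would not be needed):
ENCRYPTED=	/backup/.encrypted
LOCAL=	/backup/local

# note: make recipe lines must be indented with a tab
mount:
	pefs mount ${ENCRYPTED} ${LOCAL}
	pefs addkey -c ${LOCAL}

unmount:
	pefs unmount ${LOCAL}

sync:
	u1sync --oauth FOO:BAR ${ENCRYPTED}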
Jan 22, 2011
PEFS changelog
PEFS changelog since September 2010:
- Add AESNI hardware acceleration support.
- Several rename fixes: vnode reference leak, incorrect locking, livelock, missing lookup(), always perform nfs-style dummy rename.
- Skip directory entries with zero inode number (empty entry) (could result in reusing invalid entries).
- Fix mounting ZFS snapshots (incorrect vn_fullpath locking).
- Reduce possibility of free vnode shortage livelock by freeing vnode in-place for non-ZFS file systems and if called from vnlru proc in ZFS case. Add asyncreclaim mount option.
- Add missing vnode operations: vop_pathconf, vop_getacl. Improve error reporting in link() and truncate().
- Report correct max name size and max symlink size supported.
- Always use 4KB blocks to support archs with large page sizes in the future.
- Use AES128-CTR to encrypt keys in the chain database, simplify the Key-Encryption-Key generation procedure. The database has to be recreated after this change.
Sep 7, 2010
XTS support in pefs
I've replaced the CTR encryption mode with XTS. The Salsa20 stream cipher was also removed. CTR mode was an inappropriate design for a filesystem: it allowed encrypted data to be easily manipulated by an attacker, and could even reveal plaintext in cases where previous snapshots of the encrypted data were available to the attacker, e.g. filesystem-level snapshots. There should be no visible performance degradation from switching to XTS.
Backward compatibility with CTR mode is intentionally not provided, to prevent further misuse, so an upgrade by hand will be necessary.
Besides that, I've also committed real support for sparse files and file extending; it should make the filesystem faster in common use cases. The new version also contains a fix for a race in the rename operation.
I would like to ask people interested in getting such functionality into FreeBSD to give pefs a try; any feedback is welcome.
Installation instructions may be found in my message to the freebsd-current mailing list.
May 6, 2010
Projects status
The oldest project, l2filter, is almost certainly doomed. The patch no longer applies after ipfw3 was imported to -CURRENT and then merged to 8-STABLE. It still applies to 7-STABLE, but I don't use 7-STABLE. Merging only the layer2 filtering support for pfil and pf should be rather trivial. I'd like to keep the patches in sync with recent -CURRENT but.. no time, no testers.
pefs looks much better. I keep using it myself, and it seems pretty stable with my workload, although I once got a pefs-related panic but wasn't able to get a dump. I'd like to implement lazy file extending (lazily write encrypted zero ranges to the file after it is extended) and post it on freebsd-hackers@ once again.
This summer I'll work on the namecache. The project is rather ambitious and innovative; in a few words, it's about generalizing UFS' dirhash and exposing it to the upper layers so that it can be used for reliable full path lookups.
Dec 8, 2009
pefs and l2filter moved to github
I've just moved pefs and l2filter development to github. I hope it makes it easier for people to follow development.
The pefs repository (github.com/glk/pefs) can be used to compile and run pefs without applying any patches.
pefs changelog:
- support running on msdosfs
- enable dircache only on file systems that are known to support it
- add man page
- add pefs getkey command
- initial implementation of a pefs PAM module
The l2filter repository (github.com/glk/l2filter) contains only patches. There is a fresh patch against 8-STABLE with some minor improvements compared to the 7-STABLE version. The 9-CURRENT patch is a bit outdated at the moment, as I'm waiting for Luigi Rizzo to finish the ipfw refactoring work first.
Oct 16, 2009
pefs dircache benchmark
I've recently added directory caching into pefs.
Despite being a directory listing cache (like dirhash for UFS), it also acts as an encrypted file name cache, so there is no need to decrypt the names of the same entries over and over. That was a really big issue, because the directory listing has to be reread on almost every vnode lookup operation, which made operations on directories with 1000 or more files too time consuming.
The cache is updated at two points: during the vnode lookup operation and during the readdir call. The vnode generation attribute is used to monitor directory changes (the same way NFS works) and to expire the cache if it changes. There is no per-operation monitoring, because that would violate the stacked filesystem nature (and also complicate the code). There are some issues regarding the handling of large directories within dircache. First of all, the results of consecutive readdir calls are considered inconsistent, i.e. the cache expires if the user-provided buffer is too small to fit the entire directory listing. And while doing a vnode lookup, the search doesn't terminate once a matching directory entry is found; it traverses the rest of the directory to update the cache.
There is a vfs.pefs.dircache_enable sysctl to control cache validity. Setting it to zero forces the cache to always be treated as invalid, so dircache functions only as a file name encryption cache.
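For example, cache validation can be turned off at runtime with:
# sysctl vfs.pefs.dircache_enable=0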
At the moment caching is only enabled for name decryption, but there are operations like rm or rmdir which perform name encryption on every call to pass data to the underlying filesystem. Enabling caching for such operations is not going to be hard, but I just want the code to stabilize a bit before moving further.
I've performed two types of tests: dbench and handling directories with a large number of files. I used pefs mounted on top of tmpfs to measure pefs overhead rather than disk I/O performance. The Salsa20 algorithm with a 256-bit key was chosen as the fastest one available. Before each run the underlying tmpfs filesystem was remounted. Each test was run 3 times, and the average of the results is shown in the charts (the spread was less than 2%). Also note that I used a kernel with some extra debugging compiled in (INVARIANTS, lock debugging).
dbench doesn't show much difference with dircache enabled compared to plain pefs and old pefs without dircache: 143.635 MB/s against 116.746 MB/s; still, it's an 18% improvement, which is very good imho. Also interesting is that the result gets just a bit lower after setting vfs.pefs.dircache_enable=0: 141.289 MB/s against 143.635 MB/s.
Dbench uses directories with a small number of entries (usually ~20), which perfectly explains the results achieved. Handling large directories is where dircache shines. I used the following trivial script for testing; it creates 1000 or 2000 files, does 'ls -l' and removes them:
for i in `jot 1000`; do
touch test-$i
done
ls -Al >/dev/null
find . -name test-\* -exec rm '{}' +
The chart speaks for itself. And the per-file overhead looks much closer to the expected linear growth after running the same test with 3000 files:
Oct 1, 2009
Encrypting private directory with pefs
pefs is a kernel-level cryptographic filesystem. It works transparently on top of other filesystems and doesn't require root privileges. There is no need to allocate another partition and take additional care of backups, resizing the partition when it fills up, etc.
After installing pefs, create a new directory to encrypt. Let it be ~/Private:
% mkdir ~/Private
And mount pefs on top of it (root privileges are necessary to mount the filesystem unless you have the vfs.usermount sysctl set to a non-zero value):
% pefs mount ~/Private ~/Private
At this point ~/Private behaves like a read-only filesystem because no keys are set up yet. To make it useful, add a new key:
% pefs addkey ~/Private
After entering a passphrase, you can check active keys:
% pefs showkeys ~/Private
Keys:
0 b0bed3f7f33e461b aes256-ctr
As you can see, the AES algorithm is used by default (in CTR mode with a 256-bit key). It can be changed with the pefs addkey -a option.
You should take into account that pefs doesn't save any metadata, which means there is no way for the filesystem to "verify" the key. To work around this, key chaining can be used (pefs showchain, setchain, delchain). I'm going to show how it works in future posts.
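(The secure backups post above already relies on this mechanism. As a rough preview, using the same flags as there and leaving the details for those posts: the chain database is created with pefs addchain, and the key is then added against it with pefs addkey -c:
% pefs addchain -fZ ~/Private
% pefs addkey -c ~/Private
)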
Let's give it a try:
% echo "Hello WORLD" > ~/Private/test
% ls -Al ~/Private
total 1
-rw-r--r-- 1 gleb gleb 12 Oct 1 12:55 test
% cat ~/Private/test
Hello WORLD
Here is what it looks like at lower filesystem level:
% pefs unmount ~/Private
% ls -Al ~/Private
total 1
-rw-r--r-- 1 gleb gleb 12 Oct 1 12:55 .DU6eudxZGtO8Ry_2Z3Sl+tq2hV3O75jq
% hd ~/Private/.DU6eudxZGtO8Ry_2Z3Sl+tq2hV3O75jq
00000000 7f 1e 1b 05 fc 8a 5c 38 fc d8 2d 5f |......\8..-_|
0000000c
Your result is going to be different because pefs uses a random tweak value to encrypt files. The tweak is saved in the encrypted file name. Using the tweak also means that identical files have different encrypted content.