Ticket 703: qvm-backup: save backups in AppVM


Andrew Sorensen

Mar 27, 2013, 2:36:35 PM
to qubes...@googlegroups.com
I've been working on additions to qvm-core/qubesutils.py to add functionality for compressing files in Dom0 and sending them to an appvm. Right now, I'm just using a hacked-up version of qvm-backup to perform the backups, but eventually I want to make that compatible with the new system.

My code to do the actual copy looks something like this:
        compressor = subprocess.Popen (["tar", "-PcOz", file["path"]], stdout=subprocess.PIPE)
        subprocess.Popen (["qvm-run", "--pass-io", "-p", appvm, "cat > " + dest_dir + file["basename"] + ".tar.gz"], stdin=compressor.stdout)

(There's an issue with this code: multiple file backups end up running at the same time; I have yet to fix that.)
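
For illustration, one rough way to serialise those copies (and shell-quote the destination) could look like the sketch below -- a sketch only, reusing "files", "appvm" and "dest_dir" as hypothetical names:

import pipes
import subprocess

# Sketch only: wait for each tar | qvm-run pair to finish before starting the
# next file, and quote the destination path for the remote shell.
for f in files:
    dest = pipes.quote(dest_dir + f["basename"] + ".tar.gz")
    compressor = subprocess.Popen(["tar", "-PcOz", f["path"]],
                                  stdout=subprocess.PIPE)
    writer = subprocess.Popen(["qvm-run", "--pass-io", "-p", appvm,
                               "cat > " + dest],
                              stdin=compressor.stdout)
    compressor.stdout.close()   # let the writer see EOF when tar exits
    writer.wait()               # finish this file before starting the next
    compressor.wait()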

Here are some questions it would be interesting to get thoughts on:

Should backups to Dom0 be supported at all? (or should it require a working AppVM?) - What about restore?

How should encryption of backups be handled? (Dom0 sounds the best, but how should the user provide their key?) - What about users who create a "backups" appvm and use LUKS on their drive?

How should the command line qvm-backup command work with this additional functionality to backup into an appvm? qvm-backup backups:/mnt/removable/qubes-backups maybe?

What about the case where users have AppVMs already running? (could sync(1) them and pause the domain, then perform the backup)

Joanna Rutkowska

Mar 27, 2013, 3:17:35 PM
to qubes...@googlegroups.com, Andrew Sorensen
On 03/27/13 19:36, Andrew Sorensen wrote:
> I've been working on additions to qvm-core/qubesutils.py to add
> functionality for compressing files in Dom0 and sending them to an appvm.
> Right now, I'm just using a hacked up version of qvm-backup to perform the
> backups, but eventually I want to make that compatible with the new system
>
> My code to do the actual copy looks something like this:
> compressor = subprocess.Popen (["tar", "-PcOz", file["path"]],
> stdout=subprocess.PIPE)
> subprocess.Popen (["qvm-run", "--pass-io", "-p", appvm, "cat > " +
> dest_dir + file["basename"] + ".tar.gz"], stdin=compressor.stdout)
>
> (There's an issue with this code: multiple file backups running at the same
> time, I have yet to fix that).
>
> Here's some questions that would be interesting to get thoughts on:
>
> Should backups to Dom0 be supported at all? (or should it require a working
> AppVM?) - What about restore?
>
I see no reason to block backups to Dom0.

> How should encryption of backups be handled?

Encryption should definitely be done by Dom0. Otherwise the VM where the
backup is to be saved (or from which it is to be restored) would become as
privileged as Dom0, which would rather defeat the point of backups to AppVMs.

> (Dom0 sounds the best, but how
> should the user provide their key?) - What about users who create a
> "backups" appvm and use LUKS on their drive?
>

How about the same way cryptsetup asks for a passphrase? It could then be
passed as a string argument to either backup_prepare() or do_backup(),
to allow the passphrase to be passed also from the manager.


> How should the command line qvm-backup command work with this additional
> functionality to backup into an appvm? qvm-backup
> backups:/mnt/removable/qubes-backups maybe?
>

So 'backups' above, I assume, is the AppVM where the encrypted blob of
the backup is to be stored, correct? If so, then it looks good to me.

BTW, I don't really see a reason for using a dedicated 'backups' AppVM.
Instead I would expect people to use something like 'usbvm', or
'personal' as a backup VM (in this example the 'personal' might be an AppVM
with access to my home NAS).

> What about the case where users have AppVMs already running? (could sync(1)
> them and pause the domain, then perform the backup)
>

That's a separate issue. I'm not a filesystem expert, so I don't know if
this will always work, but it sounds like it might. But again, let's not
mix different functionalities into one task.

joanna.


Outback Dingo

Mar 27, 2013, 4:52:26 PM
to qubes...@googlegroups.com, Andrew Sorensen
Would be nice if backups could also be sent remotely via https or ssh to a remote server.

Joanna Rutkowska

Mar 27, 2013, 5:56:01 PM
to qubes...@googlegroups.com, Outback Dingo, Andrew Sorensen
On 03/27/13 21:52, Outback Dingo wrote:
> be nice if backups could also be sent over remote via https or ssh to a
> remote server
>

What you do with your backup once it makes it to an AppVM of your
choice is totally up to you. You could then use all the Linux
networking/storage tools to send it over SMB or SSH, upload it to
WebDAV, S3, and generally god-knows-what-else. But the job of qvm-backup
is only to store the (encrypted) blob in the selected AppVM. No more, no
less.

joanna.

Andrew Sorensen

Mar 27, 2013, 7:45:27 PM
to qubes...@googlegroups.com, Andrew Sorensen
On Wednesday, March 27, 2013 12:17:35 PM UTC-7, joanna wrote:
> On 03/27/13 19:36, Andrew Sorensen wrote:
>> I've been working on additions to qvm-core/qubesutils.py to add
>> functionality for compressing files in Dom0 and sending them to an appvm.
>> Right now, I'm just using a hacked-up version of qvm-backup to perform the
>> backups, but eventually I want to make that compatible with the new system.
>>
>> My code to do the actual copy looks something like this:
>>         compressor = subprocess.Popen (["tar", "-PcOz", file["path"]],
>> stdout=subprocess.PIPE)
>>         subprocess.Popen (["qvm-run", "--pass-io", "-p", appvm, "cat > " +
>> dest_dir + file["basename"] + ".tar.gz"], stdin=compressor.stdout)
>>
>> (There's an issue with this code: multiple file backups end up running at
>> the same time; I have yet to fix that.)
>>
>> Here are some questions it would be interesting to get thoughts on:
>>
>> Should backups to Dom0 be supported at all? (or should it require a working
>> AppVM?) - What about restore?
>>
> I see no reason to block backups to Dom0.

>> How should encryption of backups be handled?
>
> Encryption should definitely be done by Dom0. Otherwise the VM where the
> backup is to be saved (or from which it is to be restored) would become as
> privileged as Dom0, which would rather defeat the point of backups to AppVMs.

I will implement optional encryption in Dom0 then (in my particular use case, I have a separate VM that handles the backups).

>> (Dom0 sounds the best, but how
>> should the user provide their key?) - What about users who create a
>> "backups" appvm and use LUKS on their drive?
>>
>
> How about the same way cryptsetup asks for a passphrase? It could then be
> passed as a string argument to either backup_prepare() or do_backup(),
> to allow the passphrase to be passed also from the manager.

Is it safe to pass passwords through pipes using qvm-run (or does that keep a log somewhere)?

>> How should the command line qvm-backup command work with this additional
>> functionality to backup into an appvm? qvm-backup
>> backups:/mnt/removable/qubes-backups maybe?
>>
>
> So 'backups' above, I assume, is the AppVM where the encrypted blob of
> the backup is to be stored, correct? If so, then it looks good to me.

Yes, that is correct. I was trying to keep to the standard I saw for creating HVMs from CDs.

> BTW, I don't really see a reason for using a dedicated 'backups' AppVM.
> Instead I would expect people to use something like 'usbvm', or
> 'personal' as a backup VM (in this example the 'personal' might be an AppVM
> with access to my home NAS).

Okay.

>> What about the case where users have AppVMs already running? (could sync(1)
>> them and pause the domain, then perform the backup)
>>
>
> That's a separate issue. I'm not a filesystem expert, so I don't know if
> this will always work, but it sounds like it might. But again, let's not
> mix different functionalities into one task.

I agree. This should be moved to a separate ticket.

> joanna.

Andrew Sorensen

Mar 27, 2013, 7:47:25 PM
to qubes...@googlegroups.com, Outback Dingo, Andrew Sorensen
On Wednesday, March 27, 2013 2:56:01 PM UTC-7, joanna wrote:
> On 03/27/13 21:52, Outback Dingo wrote:
>> Would be nice if backups could also be sent remotely via https or ssh to a
>> remote server.
>>
>
> What you do with your backup once it makes it to an AppVM of your
> choice is totally up to you. You could then use all the Linux
> networking/storage tools to send it over SMB or SSH, upload it to
> WebDAV, S3, and generally god-knows-what-else. But the job of qvm-backup
> is only to store the (encrypted) blob in the selected AppVM. No more, no
> less.

Currently I'm just piping the compressed files into the AppVM. Should I keep the compression in Dom0, or drop it/make it optional? I assume that if we are going to encrypt, then I need to do compression first (e.g. in Dom0).

Marek Marczykowski

Mar 28, 2013, 12:01:45 AM
to qubes...@googlegroups.com, Andrew Sorensen, Outback Dingo
On 28.03.2013 00:47, Andrew Sorensen wrote:
> On Wednesday, March 27, 2013 2:56:01 PM UTC-7, joanna wrote:
>
>> On 03/27/13 21:52, Outback Dingo wrote:
>>> be nice if backups could also be sent over remote via https or ssh to a
>>> remote server
>>>
>>
>> What you do with your backup once it makes it to an AppVM of your
>> choice, is totally up to you. You could then use all the Linux
>> networking/storage tools, to send them over SMB, SSH, upload to a
>> WebDAV, S3, and generally god-knows-what-else. But the job of qvm-back
>> is only to store the (encrypted) blob in the selected AppVM. No more, no
>> less.
>>
>>
> Currently I'm just piping the compressed files into the AppVM. Should I
> keep the compression in Dom0, or drop it/make it optional? I assume if we
> are going to encrypt, then I need to do compression first (eg: in Dom0).

gpg already compresses the data.

I see the point in supporting direct send over https or whatever - this will
not require having space for the full backup in the AppVM.

So perhaps the backup should be _one_ blob, which is sth like:
tar c <list of files to backup> | gpg | qvm-run --pass-io backups 'QUBESRPC
qubes.StoreBackup dom0'

Then the qubes.StoreBackup service (configured as any other Qubes RPC service)
can do whatever you want with the backup - store it on a mounted drive, send it
over the network, pipe it to /dev/null...

'dom0' in the above command is the source VM name.

BTW instead of qvm-run you can use vm.run(..., passio_popen=True), this will
return the subprocess.Popen object (so pipes will be available).
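
In dom0-side Python that pipeline could look roughly like the sketch below (a sketch only: "vm" and "files_to_backup" are assumed names, and qubes.StoreBackup is the proposed, not-yet-existing, service):

import subprocess

# Stream one tar archive through gpg (still in dom0) into the service's stdin.
store = vm.run("QUBESRPC qubes.StoreBackup dom0", passio_popen=True)
gpg = subprocess.Popen(["gpg", "-c"], stdin=subprocess.PIPE, stdout=store.stdin)
tar = subprocess.Popen(["tar", "-c", "-C", "/var/lib/qubes"] + files_to_backup,
                       stdout=gpg.stdin)
tar.wait()
gpg.stdin.close()     # EOF for gpg once tar is done
gpg.wait()
store.stdin.close()   # EOF for the service once gpg is done
store.wait()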

--
Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab


Marek Marczykowski

Mar 28, 2013, 12:03:31 AM
to qubes...@googlegroups.com, Andrew Sorensen
On 28.03.2013 00:45, Andrew Sorensen wrote:
> On Wednesday, March 27, 2013 12:17:35 PM UTC-7, joanna wrote:
>
>> On 03/27/13 19:36, Andrew Sorensen wrote:
>>> (Dom0 sounds the best, but how
>>> should the user provide their key?) - What about users who create a
>>> "backups" appvm and use LUKS on their drive?
>>>
>>
>> How about the same way as cryptsetup asks for passphrase? And then
>> passed as a string argument to either backup_prepare() or do_backup(),
>> to allow the passprase to be passed also from the manager.
>>
>>
> Is it safe to pass passwords through pipes using qvm-run (or does that keep
> a log somewhere?)

Why do you want to pipe the password to another VM? It shouldn't leave dom0.

Andrew Sorensen

Mar 28, 2013, 12:06:10 AM
to qubes...@googlegroups.com, Andrew Sorensen
See joanna's comment:

Marek Marczykowski

Mar 28, 2013, 12:10:05 AM
to qubes...@googlegroups.com, Andrew Sorensen
Yes, but it still should be done in dom0; the VM should get only already-encrypted
data. And yes - passing the password through a pipe should be safe.

Andrew Sorensen

Mar 28, 2013, 12:13:24 AM
to qubes...@googlegroups.com, Andrew Sorensen
Correct, but this considers the case where the user requires a LUKS device to be decrypted in the AppVM. It's probably best that this responsibility be left to the AppVM, and the backup system kept standalone.

Andrew Sorensen

Mar 28, 2013, 12:14:58 AM
to qubes...@googlegroups.com, Andrew Sorensen, Outback Dingo
On Wednesday, March 27, 2013 9:01:45 PM UTC-7, Marek Marczykowski wrote:
> On 28.03.2013 00:47, Andrew Sorensen wrote:
>> On Wednesday, March 27, 2013 2:56:01 PM UTC-7, joanna wrote:
>>
>>> On 03/27/13 21:52, Outback Dingo wrote:
>>>> Would be nice if backups could also be sent remotely via https or ssh to a
>>>> remote server.
>>>>
>>>
>>> What you do with your backup once it makes it to an AppVM of your
>>> choice is totally up to you. You could then use all the Linux
>>> networking/storage tools to send it over SMB or SSH, upload it to
>>> WebDAV, S3, and generally god-knows-what-else. But the job of qvm-backup
>>> is only to store the (encrypted) blob in the selected AppVM. No more, no
>>> less.
>>>
>>>
>> Currently I'm just piping the compressed files into the AppVM. Should I
>> keep the compression in Dom0, or drop it/make it optional? I assume that if we
>> are going to encrypt, then I need to do compression first (e.g. in Dom0).
>
> gpg already compresses the data.
>
> I see the point in supporting direct send over https or whatever - this will
> not require having space for the full backup in the AppVM.
>
> So perhaps the backup should be _one_ blob, which is sth like:
> tar c <list of files to backup> | gpg | qvm-run --pass-io backups 'QUBESRPC
> qubes.StoreBackup dom0'
>
> Then the qubes.StoreBackup service (configured as any other Qubes RPC service)
> can do whatever you want with the backup - store it on a mounted drive, send it
> over the network, pipe it to /dev/null...

I like this idea.

> 'dom0' in the above command is the source VM name.
>
> BTW instead of qvm-run you can use vm.run(..., passio_popen=True), this will
> return the subprocess.Popen object (so pipes will be available).

Thanks. I knew there had to be a better way. I'll clean up that code.

Marek Marczykowski

Mar 28, 2013, 12:45:10 AM
to qubes...@googlegroups.com, Andrew Sorensen, Outback Dingo
Some catches in the above approach:
1. enable sparse file support in tar (AFAIR the -S option)
2. directory structure in tar: IMHO it should be based on the /var/lib/qubes
layout, so the tar would contain:
./qubes.xml
./appvms/work/private.img
./appvms/work/work.conf
./appvms/work/firewall.xml
(...)
3. VM selection at restore. This can be a problem: currently we need
qubes.xml and the list of files to check which VMs are present in the backup,
then give the user a choice of which VMs should be restored, then do the actual
restore. Ideally we'd like not to download the whole backup (about 100GB in my
case) twice. It would also be good to allow restoring one VM without requiring
disk space for the whole backup. Concrete use case:
- the whole backup takes 120GB
- the system has 20GB free
- I want to restore one of my HVMs, which takes 10GB, so I remove it first
(now I have 30GB free)
- here I have enough space for the VM to restore, but too little to unpack the
whole backup

If you have some idea how to solve the 3rd problem, it would be nice. If not, we
can leave it for later (anyway, backups are mostly for restoring the whole system
after e.g. a disk crash). Maybe we should ensure right now that the backup has
sufficient information to enable such VM selection before unpacking the whole
backup in the future - like having the list of VMs in the backup at the beginning
of the archive (qubes.xml in the backup currently stores all VMs present in the
system, not only those backed up).

One more thing to consider about restore: if we unpack the backup, it will
already be a copy of the backup (the original backup will stay intact), so we can
move files to the destination directories rather than copy them.

Joanna Rutkowska

Mar 28, 2013, 6:30:16 AM
to qubes...@googlegroups.com, Andrew Sorensen
No, I didn't mean LUKS at all! Again, all the encryption, decryption and
passphrase management should be done in Dom0 only. What I meant is to
use the same function for the passphrase prompt as e.g. cryptsetup uses.
Or gpg. Or ssh. Or, perhaps we can even use sscanf() -- I wouldn't be
picky about that.

Again, let me reiterate: whatever backup blob leaves Dom0 must
already be encrypted and signed. The AppVMs cannot be trusted for
anything regarding backup handling! Again, if they were trusted, then
they would automatically become as privileged as Dom0, which in turn makes it
pointless to use a separate AppVM for backups!

And, consequently: whatever we get from the AppVM (the backup blob)
should first be verified (gpgv --keyring dom0-backup-keyring?) and only
then should unpacking be attempted. Otherwise we risk that a
malicious AppVM might send a malformed backup blob that could exploit
the Dom0 untarring/uncompressing code.

joanna.


Marek Marczykowski

Mar 28, 2013, 7:51:28 AM
to qubes...@googlegroups.com, Joanna Rutkowska, Andrew Sorensen
The last part could be tricky: how do we verify the backup signature if my keys
are in the backup itself? Some additional method to import keys for backups,
right? So how do we verify those keys...

For just encryption we can use symmetric encryption (so only a passphrase is
needed). Anyway, if someone is able to:
1. encrypt a backup using the right passphrase (so they know the passphrase)
and
2. somehow substitute the original backup - and they probably have access to the
original backup this way,
then the game is already over.
So I think we can stick with simple gpg --symmetric.

Joanna Rutkowska

Mar 28, 2013, 8:20:46 AM
to Marek Marczykowski, qubes...@googlegroups.com, Andrew Sorensen
Good point :)

> For just encryption we can use symmetric encryption (so only passphrase
> needed). Anyway if someone is able to:
> 1. encrypt backup using right passphrase (so know passphrase)
> and
> 2. somehow substitute original backup - will probably have access to original
> backup this way,
> then already game is over.
> So I think we can stick with simple gpg --symmetric.
>
I'm afraid of an attack where some bits in the encrypted blob are (more
or less) randomly modified by the attacker, which would result in some
(more or less) garbage after decryption. In the best case these would
cause a hard-to-detect DoS on the backups, and in the worst case (but
rather unlikely) something worse. But a stealth DoS attack on backups
is what I fear the most. Of course a compromised AppVM can always do a
DoS on my backup, but otherwise I can easily detect it (and copy the
backup from the CDROM or whatever, again).

So, perhaps a simple HMAC would do?

openssl dgst -hmac <passphrase> backup.blob

Perhaps, just to be extra safe, the passphrase used for the HMAC could be
obtained by hashing the actual backup encryption passphrase.
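
A minimal sketch of the verification side, assuming the HMAC key is derived by hashing the passphrase as suggested (the SHA-256 choice and all names here are assumptions):

import hashlib
import hmac

def verify_backup_hmac(blob_path, hmac_hex, passphrase):
    # Derive the HMAC key from the backup passphrase and check the blob
    # before anything else touches it.
    key = hashlib.sha256(passphrase).digest()
    mac = hmac.new(key, digestmod=hashlib.sha256)
    with open(blob_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), ""):
            mac.update(chunk)
    return mac.hexdigest() == hmac_hex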

joanna.


Joanna Rutkowska

Mar 28, 2013, 8:22:15 AM
to Marek Marczykowski, qubes...@googlegroups.com, Andrew Sorensen
Ah, and if that wasn't clear: such HMAC would be distributed together
with the blob.tgz. Perhaps concatenated on top of it, just to keep it as
one file?

j.


Marek Marczykowski

Mar 28, 2013, 8:26:35 AM
to Joanna Rutkowska, qubes...@googlegroups.com, Andrew Sorensen
Doesn't gpg already do something like this (a checksum of the plain data checked
during decryption)? If not - indeed such an HMAC would be desirable.

Joanna Rutkowska

Mar 28, 2013, 9:52:33 AM
to Marek Marczykowski, qubes...@googlegroups.com, Andrew Sorensen
It doesn't:


# Create a test blob (I use no compression to make the experiment easier):

[user@work-pub test]$ dd if=/dev/zero of=test.bin bs=1k count=1k
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.0060514 s, 173 MB/s
[user@work-pub test]$ gpg -c --compress-level 0 test.bin
[user@work-pub test]$ ll
total 2052
-rw-rw-r-- 1 user user 1048576 Mar 28 13:45 test.bin
-rw-rw-r-- 1 user user 1048625 Mar 28 13:45 test.bin.gpg
[user@work-pub test]$ cp test.bin.gpg test-hacked.bin.gpg

# Let's introduce one byte mutation at a random position in the
encrypted file:

[user@work-pub test]$ dd if=/dev/zero of=test-hacked.bin.gpg bs=1
count=1 seek=666999 conv=notrunc
1+0 records in
1+0 records out
1 byte (1 B) copied, 5.4258e-05 s, 18.4 kB/s

# Let's verify the mutation made it to the encrypted blob indeed:

[user@work-pub test]$ xxd test.bin.gpg > test.bin.gpg.xxd
[user@work-pub test]$ xxd test-hacked.bin.gpg > test-hacked.bin.gpg.xxd
[user@work-pub test]$ diff test.bin.gpg.xxd test-hacked.bin.gpg.xxd
41688c41688
< 00a2d70: 248a 7146 60ba f0fd c42e 3341 f6fc 5eef $.qF`.....3A..^.
---
> 00a2d70: 248a 7146 60ba f000 c42e 3341 f6fc 5eef $.qF`.....3A..^.

# (Note the zero byte above instead of 0xfd)

# Now, let's try to decrypt it:

[user@work-pub test]$ gpg test-hacked.bin.gpg
gpg: CAST5 encrypted data
gpg: encrypted with 1 passphrase
gpg: WARNING: message was not integrity protected

# Aha, we got a warning about no integrity protection. But the same
warning is printed also for the unmutated blob:

[user@work-pub test]$ gpg test.bin.gpg
gpg: CAST5 encrypted data
gpg: encrypted with 1 passphrase
File `test.bin' exists. Overwrite? (y/N) n
Enter new filename: test2.bin
gpg: WARNING: message was not integrity protected

# Now let's see how the decrypted blobs were affected:

[user@work-pub test]$ xxd test.bin > test.bin.xxd
[user@work-pub test]$ xxd test-hacked.bin > test-hacked.bin.xxd
[user@work-pub test]$ diff test.bin.xxd test-hacked.bin.xxd
41685,41686c41685,41686
< 00a2d40: 0000 0000 0000 0000 0000 0000 0000 0000 ................
< 00a2d50: 0000 0000 0000 0000 0000 0000 0000 0000 ................
---
> 00a2d40: 0000 0000 0000 fd00 0000 0000 0036 46b3 .............6F.
> 00a2d50: 4e0f 442b dc00 0000 0000 0000 0000 0000 N.D+............

As expected a single byte modification to the encrypted blob caused
multiple mutations in the original, decrypted blob.

Now, the question is -- how to enable integrity protection for gpg -c? I
can't find any info about this in the manual...?

joanna.


Joanna Rutkowska

Mar 28, 2013, 10:13:17 AM
to qubes...@googlegroups.com, Marek Marczykowski, Andrew Sorensen
Apparently the --force-mdc switch seems to be what we need:

[user@work-pub test]$ gpg -c --force-mdc --compress-level 0 test.bin
[user@work-pub test]$ cp test.bin.gpg test-hacked.bin.gpg
[user@work-pub test]$ dd if=/dev/zero of=test-hacked.bin.gpg bs=1
count=1 seek=666999 conv=notrunc
1+0 records in
1+0 records out
1 byte (1 B) copied, 4.4214e-05 s, 22.6 kB/s
[user@work-pub test]$ gpg test-hacked.bin.gpg
gpg: CAST5 encrypted data
gpg: encrypted with 1 passphrase
gpg: WARNING: encrypted message has been manipulated!
[user@work-pub test]$ echo $?
2

However, notice how things break down if we use compression and then
modify the encrypted blob (specifically, superficially lengthening it):

[user@work-pub test]$ gpg -c --force-mdc test.bin
[user@work-pub test]$ ll
total 1028
-rw-rw-r-- 1 user user 1048576 Mar 28 13:45 test.bin
-rw-rw-r-- 1 user user 1115 Mar 28 14:07 test.bin.gpg
[user@work-pub test]$ cp test.bin.gpg test-hacked.bin.gpg
[user@work-pub test]$ dd if=/dev/zero of=test-hacked.bin.gpg bs=1
count=1 seek=666999 conv=notrunc
1+0 records in
1+0 records out
1 byte (1 B) copied, 5.2511e-05 s, 19.0 kB/s
[user@work-pub test]$ ll
total 1036
-rw-rw-r-- 1 user user 667000 Mar 28 14:07 test-hacked.bin.gpg
-rw-rw-r-- 1 user user 1048576 Mar 28 13:45 test.bin
-rw-rw-r-- 1 user user 1115 Mar 28 14:07 test.bin.gpg
[user@work-pub test]$ gpg test-hacked.bin.gpg
gpg: CAST5 encrypted data
gpg: encrypted with 1 passphrase
gpg: [don't know]: indeterminate length for invalid packet type 10
gpg: mdc_packet with invalid encoding
gpg: decryption failed: invalid packet
gpg: encrypted with 1 passphrase
gpg: assuming IDEA encrypted data
gpg: [don't know]: invalid packet (ctb=47)
gpg: WARNING: message was not integrity protected
gpg: WARNING: multiple plaintexts seen
gpg: handle plaintext failed: unexpected data
gpg: [don't know]: invalid packet (ctb=07)
[user@work-pub test]$ echo $?
2

Or let's use some verbosity to see what's happening inside:

[user@work-pub test]$ gpg -vv test-hacked.bin.gpg
:symkey enc packet: version 4, cipher 3, s2k 3, hash 2
salt f57aa18be9a30a1a, count 65536 (96)
gpg: CAST5 encrypted data
:encrypted data packet:
length: unknown
mdc_method: 2
gpg: encrypted with 1 passphrase
:compressed packet: algo=1
:literal data packet:
mode b (62), created 1364479642, name="test.bin",
raw data: 1048576 bytes
gpg: original file name='test.bin'
File `test-hacked.bin' exists. Overwrite? (y/N) y
:unknown packet: type 44, length 10
dump: 01 48 ab a7 78 f4 ec 0a 01 48
gpg: [don't know]: indeterminate length for invalid packet type 10
gpg: mdc_packet with invalid encoding
gpg: decryption failed: invalid packet
:encrypted data packet:
length: unknown
gpg: encrypted with 1 passphrase
gpg: assuming IDEA encrypted data
gpg: [don't know]: invalid packet (ctb=47)
gpg: decryption okay
gpg: WARNING: message was not integrity protected
:literal data packet:
mode S (53), created 1398146990,
name="\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92SV\x07\xaeG\xcb\xc6\x92",
raw data: 1782 bytes
gpg: original file
name='�G�SV�G�SV�G�SV�G�SV�G�SV�G�SV�G�SV�G�SV�G�SV�G�SV�G�'
gpg: WARNING: multiple plaintexts seen
gpg: handle plaintext failed: unexpected data
gpg: [don't know]: invalid packet (ctb=07)
[user@work-pub test]$

Nice, huh!

So, we see that gpg parses the blob significantly before ever
verifying its integrity. This sounds like a very bad idea to me, as it
exposes many code paths in GPG to attack if the blob has been
intentionally malformed.

So, I would suggest using openssl dgst -hmac to verify the integrity of
the blob first, and only then passing it to GPG for decryption. Or
perhaps we could also use openssl to handle the decryption? It doesn't sound
like a critical decision, but I think it would be more elegant to stick
to just one crypto framework -- openssl in that case.

joanna.


Marek Marczykowski

Mar 28, 2013, 10:16:15 AM
to Joanna Rutkowska, qubes...@googlegroups.com, Andrew Sorensen
Actually --force-mdc does the job:

[user@testvm gpgtest]$ gpg -c --compress-level 0 --force-mdc test.bin
(...)
[user@testvm gpgtest]$ gpg test-hacked.bin.gpg
gpg: CAST5 encrypted data
gpg: encrypted with 1 passphrase
gpg: WARNING: encrypted message has been manipulated!
[user@testvm gpgtest]$ echo $?
2


[user@testvm gpgtest]$ gpg test.bin.gpg
gpg: CAST5 encrypted data
gpg: encrypted with 1 passphrase
File `test.bin' exists. Overwrite? (y/N) y
[user@testvm gpgtest]$ echo $?
0

Marek Marczykowski

Mar 28, 2013, 10:20:12 AM
to Joanna Rutkowska, qubes...@googlegroups.com, Andrew Sorensen
(...)

Ok, you were first :)

Marek Marczykowski

Mar 28, 2013, 6:38:48 PM
to Joanna Rutkowska, qubes...@googlegroups.com, Andrew Sorensen
One possible problem with this approach is the requirement for two-pass
processing of the data:
1. verify the HMAC
2. decrypt and do further processing
This requires storing the file locally, or downloading it twice, which can be bad
(check my earlier example use case). Anyway, the same data will already be
parsed by openssl during HMAC verification. Do you believe there is a significant
difference (in terms of security) between "openssl dgst" and "openssl enc -d"?

If we assume trust in openssl, we can use "openssl dgst (...) | openssl enc -d
(...)" and check whether the data was correct at the end.

The other solution for this problem is to split the data into chunks and apply an
HMAC to each chunk separately. But this will definitely be more complex.

Any ideas? Maybe the feature of partial restore without sufficient disk space for
the full backup is just too much effort?
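
A sketch of the single-pass variant (all names here are assumptions, and the caveat above still applies: the decryptor sees data before the HMAC is checked):

import hashlib
import hmac
import os
import subprocess
import sys

def receive_verify_and_decrypt(key, expected_hmac_hex, out_path):
    # Update the HMAC over the encrypted stream while gpg decrypts it in
    # parallel; if the final HMAC does not match, throw the plaintext away.
    mac = hmac.new(key, digestmod=hashlib.sha256)
    gpg = subprocess.Popen(["gpg", "-d", "-o", out_path], stdin=subprocess.PIPE)
    for chunk in iter(lambda: sys.stdin.read(1 << 20), ""):
        mac.update(chunk)       # one pass over the incoming data...
        gpg.stdin.write(chunk)  # ...shared with the decryptor
    gpg.stdin.close()
    gpg.wait()
    if mac.hexdigest() != expected_hmac_hex:
        os.unlink(out_path)     # don't keep plaintext from a tampered blob
        raise RuntimeError("backup HMAC mismatch")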

Alex Dubois

Apr 1, 2013, 4:39:42 AM
to qubes...@googlegroups.com
Is it because no VM should run during backup/restore that a DispVM is not leveraged to process the hashing & encryption/decryption? It is the default thinking in Qubes to mitigate/limit potential attacks :-)

Alex

Joanna Rutkowska

Apr 1, 2013, 6:33:04 AM
to qubes...@googlegroups.com, Alex Dubois
I don't see how use of a DispVM could provide any benefit in this case.
Imagine the gpg that runs in the DispVM (and which is to be used for
verification and decryption of the backup blob) gets exploited -- it can
now feed *any* backup blob (unverified, etc.) to Dom0 and further
compromise the whole system -- either by exploiting the next gpg that
was to be run in Dom0 (if Dom0 was to perform decryption) or, if Dom0
was to rely completely on the DispVM to do the decryption, by providing
a malicious backup blob which, when restored, would bring in compromised
AppVMs, and even a compromised Dom0 home. So, we're not gaining anything by
moving verification and decryption to a DispVM.

joanna.


Alex Dubois

Apr 2, 2013, 4:19:56 AM
to Joanna Rutkowska, qubes...@googlegroups.com


Alex
Very true.

I am a bit stubborn like you... What about hash+OpenSSL in a DispVM and gpg in Dom0 (encapsulated)... Easy to suggest, but I have yet to provide any patch...

> joanna.
>

Joanna Rutkowska

Apr 2, 2013, 4:42:16 AM
to Alex Dubois, qubes...@googlegroups.com
I don't think you get it :/

There is no point in doing any critical crypto in DispVM. If whatever
verification you wanted to do in DispVM is to be considered safe (as
e.g. OpenSSL's dgst) then there is no benefit of doing it in DispVM
(because it it is "safe" then we can do it in Dom0 as well). If, on the
other hand, we assume it is not safe, then there is no benefit of doing
it in DispVM either -- because this will not protect our Dom0 from
ultimately getting malformed, unverified data.

joanna.


Joanna Rutkowska

Apr 2, 2013, 7:48:23 PM
to Alex Dubois, qubes...@googlegroups.com
[Adding back the list]

On 04/02/13 19:25, Alex Dubois wrote:
>
>
> Alex
>
> On 2 Apr 2013, at 09:42, Joanna Rutkowska
> Unless you try to mitigate a risk by lowering its threat probability.
> OpenSSL digest in one dispVM and gpg digest in a second as the
> probability of both being unsafe (very small) is the product.
>
The above statement is not true, for the reasons I outlined before.

> This is if you feel computing a digest is unsafe. I though I could
> trust such function until you raised the point.
>
That's an incorrect interpretation of what I wrote before. I wrote that
the way gpg parses the incoming file, *before* actually verifying its
digest, is insecure.

joanna.


Andrew Sorensen

Jul 7, 2013, 9:38:24 PM
to qubes...@googlegroups.com
Brief status update:

I have backups (with encryption or compression) to an AppVM working. I still need to make adjustments to get the "backup to a local directory" and "backup without compression or encryption" functions working again, in addition to restoration.

The source is available here: https://github.com/AndrewX192/qubes-core

When I finish re-adding the existing backup functionality and add a system to restore the backup I will submit a more formal patch.



Joanna Rutkowska

Aug 1, 2013, 10:23:12 AM
to qubes...@googlegroups.com, Andrew Sorensen, Marek Marczykowski
So, one suggestion we recently discussed with Marek:

First, use qubes services instead of qvm-run. This will nicely allow setting up
policy rules regarding who can request the service (e.g. Dom0 only).

Also, take a string in qvm-backup that would specify the target
directory in the destination VM's filesystem where the backup should be stored,
e.g.:

dom0$ qvm-backup --remote-storage backupvm:/mnt/my-nas-storage

Alternatively, take the name of the program to run there, e.g. s3put.

So this string should be passed to the service before the rest of the
stream. That should make this really flexible and powerful.
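
The service script in the backup VM could then be as dumb as the sketch below (the file name, path and service layout are assumptions, not an existing Qubes service):

#!/usr/bin/env python
# Hypothetical /usr/lib/qubes/qubes.Backup in the backup VM: the first line of
# stdin carries the user-supplied target directory, the rest is the (already
# encrypted) backup blob.
import os
import shutil
import sys

target_dir = sys.stdin.readline().strip()
with open(os.path.join(target_dir, "qubes-backup.gpg"), "wb") as out:
    shutil.copyfileobj(sys.stdin, out)   # just store the stream, nothing more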

joanna.

Marek Marczykowski-Górecki

Aug 1, 2013, 2:10:53 PM
to Joanna Rutkowska, qubes...@googlegroups.com, Andrew Sorensen
On 01.08.2013 16:23, Joanna Rutkowska wrote:
> So, one suggestion we recently discussed with Marek:
>
> First, use qubes services instead of qvm-run. This will nicely allow to
> setup policy rules regarding who can request the service (e.g. Dom0 only).

Just to clarify: calling a qubes service from dom0 is just (qvm-)running the
special command "QUBESRPC <service name>".
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


Joanna Rutkowska

Aug 1, 2013, 4:27:19 PM
to Marek Marczykowski-Górecki, qubes...@googlegroups.com, Andrew Sorensen
On 08/01/13 20:10, Marek Marczykowski-Górecki wrote:
> On 01.08.2013 16:23, Joanna Rutkowska wrote:
>> So, one suggestion we recently discussed with Marek:
>>
>> First, use qubes services instead of qvm-run. This will nicely allow to
>> setup policy rules regarding who can request the service (e.g. Dom0 only).
>
> Just to clarify: calling qubes service from dom0 is just (qvm-)run special
> command "QUBESRPC <service name>".
>
Well, it also means adding a few helper files (the one with the service program
in the VM, and a policy definition in Dom0).

j.

Olivier Médoc

Aug 8, 2013, 4:56:43 AM
to qubes...@googlegroups.com
Hello,

I'm waiting for this new backup feature because I'm sharing new
templates with colleagues using SSH through a VM. I'm currently using an
ugly bash script from dom0, and it is not user friendly at all.

Do you need additional support for this ticket?

Olivier

Andrew Sorensen

Aug 10, 2013, 3:16:27 PM
to qubes...@googlegroups.com, Olivier Médoc
I have changes that implement the backup system but not the restore
system: https://github.com/AndrewX192/qubes-core

Marek Marczykowski-Górecki

Aug 11, 2013, 10:30:02 PM
to Andrew Sorensen, qubes...@googlegroups.com, Olivier Médoc
On 10.08.2013 21:16, Andrew Sorensen wrote:
I've looked briefly at your code and have some comments on the backup format.
You create the backup as a directory full of files (same as the current backup
format), each encrypted separately. The user will be prompted for the password
for each such file separately, which is an obvious inconvenience. Also, having
the backup as multiple files makes it difficult to send it directly to some
external service (e.g. via ncftpput). Also, such a multiple-file backup discloses
VM names even if the VMs' data is encrypted.

IMHO a good solution for all of the above problems would be to create the backup
as one binary blob, then encrypt it with one gpg process (on the fly). This "one
binary blob" can be a single tar archive (the command to create it would be
quite long).

Restoring such a one-file backup could be somewhat tricky. It's all about disk
space: if you have a 128GB SSD and your Qubes VMs weigh 100GB, you will
not have space to store the full backup *and* the just-restored system. So
restore should be done on the fly.
The second problem is partial restore (only selected VMs): first you need to
retrieve the list of VMs, then the data - only of the selected VMs. The common
use case for it would be to restore the data, skipping netvm and firewallvm
(which already exist after system install). But the question is whether we want
to support such scenarios in connection with external backups. I think we can
answer "no" here - so if the user wants a partial restore, he/she needs to
provide the backup on some regular local storage connected to dom0. Still, some
generic settings could apply, like "Ignore already existing VMs".

So the backup would be something like (python-based pseudocode based on your
code):
tar_cmdline = ["tar", "cS", "-C", "/var/lib/qubes"]
for file in files_to_backup:
    # store paths relative to /var/lib/qubes
    tar_cmdline.append(file["path"].replace("/var/lib/qubes/", "", 1))
retcode = vm.run(command="cat > {0}".format(dest_dir + "backup.gpg"),
                 passio_popen=True)
compressor = subprocess.Popen(["gpg", "-ac", "--force-mdc", "-o-"],
                              stdin=subprocess.PIPE, stdout=retcode.stdin)
archiver = subprocess.Popen(tar_cmdline, stdout=compressor.stdin)

"-C" option could be also useful to change current directory of tar _between_
some files (perhaps dom0 home backup case).

The restore part: in this case it is hard extract first qubes.xml, then (only
selected) backup data - so no partial restore. But maybe it isn't problem?
Maybe backup can be extracted as the whole to some directory, then each
directory *move* to the right place? This will be easy to implement based on
the current code: you need only add some code to retrieve full backup (fetch,
decrypt, unpack) from some VM and store it in local directory. Then call
original code on that directory.

Andrew Sorensen

Aug 12, 2013, 1:53:40 AM
to Marek Marczykowski-Górecki, qubes...@googlegroups.com, Olivier Médoc
On 08/11/13 19:30, Marek Marczykowski-Górecki wrote:
> On 10.08.2013 21:16, Andrew Sorensen wrote:
>> On 08/08/13 01:56, Olivier Médoc wrote:
>>> Hello,
>>>
>>> I'm waiting for this new backup feature because I'm sharing new
>>> templates with collegues using SSH through a VM. I'm currently using
>>> an ugly bash script from dom0, and it is not user friendly at all.
>>>
>>> Do you need additionnal support for this ticket ?
>> I have changes that implement the backup system but not the restore
>> system: https://github.com/AndrewX192/qubes-core
> I've looked briefly at your code and have some comments on backup format.
> You create backup as directory full of files (same as the current backup
> format), each encrypted separately. The user will prompted for the password
> for each such file separately, which is obvious inconvenience. Also having
> backup as multiple files makes it difficult to send it directly to some
> external service (eg. via ncftpput). Also such multiple files backup discloses
> VMs names even if the VMs data is encrypted.
I think we can address the issue of the user needing to provide their
password multiple times with gpg-agent. I just need to make sure the
agent is started before the backup process is started and that the gpg
instances spawned during backup have access to the current instance of
the agent.

Despite the issues of information disclosure from encrypting each VM
separately, I think we'd be putting some rather unfortunate limitations
on the user if the backup system only natively supported backing up to a
single file. It really depends on what the user expects to use the
backup system for (e.g. disaster recovery or accidental file deletions).

Regardless, I'm not sure what to do with qubes.xml - if it is expected that
a user will do partial restores, replacing qubes.xml could cause appvms
not part of the backup to disappear from the qubes-manager listing.

Olivier Médoc

Aug 12, 2013, 6:08:05 AM
to qubes...@googlegroups.com
Extracting information from an XML file is really straightforward in Python:
import xml.dom.minidom
from xml.dom import Node

dom = xml.dom.minidom.parse(filename)
for vmnode in dom.getElementsByTagName('QubesTemplateVm'):
    if vmnode.getAttribute('name') == 'fedora-18-x64':
        print "VM xml node", vmnode
for vmnode in dom.getElementsByTagName('QubesAppVm'):
    pass
for vmnode in dom.getElementsByTagName('QubesHVm'):
    pass
...
You can also use the Qubes high-level tools to load the collection from XML:
backup_collection = QubesVmCollection(store_filename=yourbackupxmlfile)
backup_collection.lock_db_for_reading()  # optional?
backup_collection.load()

Check in qubesutils.py how to select and insert only the required VM
metadata.






Marek Marczykowski-Górecki

Aug 12, 2013, 6:56:06 AM
to Andrew Sorensen, qubes...@googlegroups.com, Olivier Médoc
On 12.08.2013 07:53, Andrew Sorensen wrote:
> On 08/11/13 19:30, Marek Marczykowski-Górecki wrote:
>> On 10.08.2013 21:16, Andrew Sorensen wrote:
>>> On 08/08/13 01:56, Olivier Médoc wrote:
>>>> Hello,
>>>>
>>>> I'm waiting for this new backup feature because I'm sharing new
>>>> templates with collegues using SSH through a VM. I'm currently using
>>>> an ugly bash script from dom0, and it is not user friendly at all.
>>>>
>>>> Do you need additionnal support for this ticket ?
>>> I have changes that implement the backup system but not the restore
>>> system: https://github.com/AndrewX192/qubes-core
>> I've looked briefly at your code and have some comments on backup format.
>> You create backup as directory full of files (same as the current backup
>> format), each encrypted separately. The user will prompted for the password
>> for each such file separately, which is obvious inconvenience. Also having
>> backup as multiple files makes it difficult to send it directly to some
>> external service (eg. via ncftpput). Also such multiple files backup discloses
>> VMs names even if the VMs data is encrypted.
> I think we can address the issue of the user needing to provide their
> password multiple times with gpg-agent. I just need to make sure the
> agent is started before the backup process is started and that the gpg
> instances spawned during backup have access to the current instance of
> the agent.

I'm not sure if gpg-agent can be used as a symmetric encryption key cache...
Really, a single encrypted tar archive is a very simple backup format. It is even
easier to handle manually (in some disaster recovery case) than a bunch of
separately encrypted archives.

> Regardless, I'm not sure what do with qubes.xml - if it is expected that
> a user will do partial restores, replacing qubes.xml could cause appvms
> not part of the backup to disappear from the qubes-manager listing.

Olivier already answered this - you can use QubesVmCollection to load an
arbitrary qubes.xml file.

Olivier Médoc

Aug 12, 2013, 8:50:50 AM
to qubes...@googlegroups.com
Just for documentation purposes, here are the backup use cases I found / use:

Use case A: Full system restore because of a crash: everything has to be
restored except the NetVM and FirewallVMs that are already there.

Use case B: Full system restore, but I still want to restore my NetVM
because my wireless passwords are stored within it. And VPN access is
probably a similar issue.

Use case C: Restoration of data because of a loss in an AppVM home, or
because of a broken / screwed-up Template. I want to rename my current VM
and restore an old version in order to access the data. Or maybe
replace it completely.

Use case D: Backup of VMs that are used as Templates, or specific HVMs.
For example, in my case, prepared Windows VMs. In this case, I want to
restore only one VM.

Possible solutions for use cases B-C-D:
Solution 1: Only the VMs that have specific names can be stored
temporarily in dom0 so that a second-pass restoration can occur.
Solution 2: Some of the VMs need to be split out, maybe on user demand
during backup.
Solution 3: To simplify the code of Solution 2, a new backup process can
be initiated for each VM that needs to be split out.
Solution 4: Users that want these specific use cases take care of the
VMs they back up and can create backups for single VMs.
Solution 5: Implementation of a VM Template library mechanism (a new GUI)
that (re)uses the qubesutils.py and qvm-backup code, backing up some VMs
separately.
Solution 6: Implement a solution allowing on-the-fly
decryption/verification (is on-the-fly verification possible with gpg?),
then select the VMs to be restored based on the .xml file (which would be
stored at the beginning of the binary blob), then download the full binary
blob again, extracting only the required VMs. Well, this solution
looks ugly...

Of course, use case A is the most common one, but I'm ready to implement
something for use case D even if it is just for myself (in fact, I have
something with qvm-run|ssh|unencrypted which is not user friendly at all).

I don't see any good solution that solves all the use cases. Maybe you
have some ideas?

Olivier Médoc

Aug 13, 2013, 5:38:28 AM
to qubes...@googlegroups.com
On 08/12/13 07:53, Andrew Sorensen wrote:
> On 08/11/13 19:30, Marek Marczykowski-Górecki wrote:
>> On 10.08.2013 21:16, Andrew Sorensen wrote:
Hello,

I tested your backup code. Some notes about optimisations:
- Use gpg -c alone. This way, you will reduce the size of the backup by
using binary data instead of ASCII armor.
- Use the tar option --sparse. It will reduce the time to perform a backup
by a factor of two (in fact the -S option proposed by Marek).

Also, I tried to use two Popen calls, one for tar and one for gpg2, as
proposed by Marek, linking STDINs to STDOUTs.

I can send you a patch if you are interested. Just say if you are
already working on it so that I don't do it for nothing.

Olivier

Olivier Médoc

Aug 13, 2013, 11:53:51 AM
to qubes...@googlegroups.com
On 08/01/13 22:27, Joanna Rutkowska wrote:
> On 08/01/13 20:10, Marek Marczykowski-Górecki wrote:
>> On 01.08.2013 16:23, Joanna Rutkowska wrote:
>>> So, one suggestion we recently discussed with Marek:
>>>
>>> First, use qubes services instead of qvm-run. This will nicely allow to
>>> setup policy rules regarding who can request the service (e.g. Dom0 only).
>> Just to clarify: calling qubes service from dom0 is just (qvm-)run special
>> command "QUBESRPC <service name>".
>>
> Well, it is also adding a few helper files (the one with service program
> in the VM, and a policy definition in Dom0).
>
> j.
Playing with QUBESRPC, I have a question:

You can apparently only pass a single argument to the QUBESRPC call
(which is normally the source VM?). How can you pass an argument without
using STDIN (the command you want to run, or the destination of the
backup)? Because you then have to send the backup itself using STDIN.

Or maybe by using some synchronization rule? (e.g. readline once for the
arguments, then pass the rest of the stream through.)

Marek Marczykowski-Górecki

Aug 13, 2013, 12:05:10 PM
to qubes...@googlegroups.com, Olivier Médoc
On 13.08.2013 17:53, Olivier Médoc wrote:
> On 08/01/13 22:27, Joanna Rutkowska wrote:
>> On 08/01/13 20:10, Marek Marczykowski-Górecki wrote:
>>> On 01.08.2013 16:23, Joanna Rutkowska wrote:
>>>> So, one suggestion we recently discussed with Marek:
>>>>
>>>> First, use qubes services instead of qvm-run. This will nicely allow to
>>>> setup policy rules regarding who can request the service (e.g. Dom0 only).
>>> Just to clarify: calling qubes service from dom0 is just (qvm-)run special
>>> command "QUBESRPC <service name>".
>>>
>> Well, it is also adding a few helper files (the one with service program
>> in the VM, and a policy definition in Dom0).
>>
>> j.
> Playing with QUBESRPC, I have a question:
>
> You can apparently only pass a single argument to the QUBESRPC call (which is
> normally source VM ?). How can you pass an argument without using STDIN (the
> command you want to run, or the destination of the backup). Because you then
> have th send the backup using STDIN.

Better not to do that - leave the real source domain as this parameter.

> Or maybe by using some synchronization rule ? (eg: readline once for the
> arguments, the rest passing to stdout).

This is the way to go, sth like:
sh -c 'read theparameter; exec destination-program $theparameter'

Olivier Médoc

Aug 13, 2013, 12:15:48 PM
to qubes...@googlegroups.com
On 08/13/13 18:05, Marek Marczykowski-Górecki wrote:
> On 13.08.2013 17:53, Olivier Médoc wrote:
>> On 08/01/13 22:27, Joanna Rutkowska wrote:
>>> On 08/01/13 20:10, Marek Marczykowski-Górecki wrote:
>>>> On 01.08.2013 16:23, Joanna Rutkowska wrote:
>>>>> So, one suggestion we recently discussed with Marek:
>>>>>
>>>>> First, use qubes services instead of qvm-run. This will nicely allow to
>>>>> setup policy rules regarding who can request the service (e.g. Dom0 only).
>>>> Just to clarify: calling qubes service from dom0 is just (qvm-)run special
>>>> command "QUBESRPC <service name>".
>>>>
>>> Well, it is also adding a few helper files (the one with service program
>>> in the VM, and a policy definition in Dom0).
>>>
>>> j.
>> Playing with QUBESRPC, I have a question:
>>
>> You can apparently only pass a single argument to the QUBESRPC call (which is
>> normally source VM ?). How can you pass an argument without using STDIN (the
>> command you want to run, or the destination of the backup). Because you then
>> have th send the backup using STDIN.
> Better don't do that - leave real source domain as this parameter.
>
>> Or maybe by using some synchronization rule ? (eg: readline once for the
>> arguments, the rest passing to stdout).
> This is way to go, sth like:
> sh -c 'read theparameter; exec destination-program $theparameter'
Good, I got the backup working :)

I'll go home and check later whether it works when sending an ssh command
instead of a directory, but it should work.

- single tarball piped optionally to gpg2 with the optimisations I
discussed earlier
- usage of QUBESRPC
- feedback based on tar --checkpoint piped to a temporary file, and %
computations
- possible to run a command inside the VM instead of the directory
target (coded but to be tested)
- possible to run a backup to dom0 instead of a VM (coded but to be tested)

Olivier Médoc

Aug 14, 2013, 5:20:20 AM
to qubes...@googlegroups.com
Hello,

Please find attached the patches of the features discussed below. These
patches are based on Andrew's repository.

So far, I tested:
- Backup Error handling
- Backup progress feedback
- Backup to dom0
- Backup to a VM directory
- Backup through a VM tool: tested with ssh using multiple arguments and
quotes

So far, the following things are missing:
- Qubes service policies for qubes.Backup (is it normal that it is
accepted by default even if I didn't create any policy file?)
- Restore from a VM
- Partial restore / backup testing mechanism (to be defined; it could be a
pre-restore, then a selection of VMs to be restored - I will try several
approaches). This can be achieved by extracting only qubes.xml, which is in
the first kbytes of the tar file (it also works if it is gpg encrypted).
One issue is that qubes.xml contains all the VMs, not only the ones that
have been backed up.
- GUI backup/restore adaptation for appvm or encryption selection.


On 08/13/13 18:15, Olivier Médoc wrote:
> On 08/13/13 18:05, Marek Marczykowski-Górecki wrote:
>> On 13.08.2013 17:53, Olivier Médoc wrote:
>>> On 08/01/13 22:27, Joanna Rutkowska wrote:
0001-backup-implemented-use-of-tar-gpg2-instead-of-only-e.patch
0002-backup-improved-performance-by-optimizing-tar-and-gp.patch
0003-backup-implemented-use-of-a-single-tar-file-instead-.patch
0004-backup-implemented-progress-feedback-using-tar-check.patch
0005-backup-major-revamp-of-the-backup-code-to-include-ba.patch

Marek Marczykowski-Górecki

Aug 14, 2013, 5:36:11 AM
to qubes...@googlegroups.com, Olivier Médoc
On 14.08.2013 11:20, Olivier Médoc wrote:
> Hello,
>
> Please find attached the patches of the features discussed below. These
> patches are based on Andrew's repository.
>
> So far, I tested:
> - Backup Error handling
> - Backup progress feedback
> - Backup to dom0
> - Backup to a VM directory
> - Backup through a VM tool: tested with ssh using multiple arguments and quotes
>
> So far, the following things are missing:
> - Qubes service policies for qubes.Backup (is it normal that it is accepted by
> default even if I didn't created any policy file ?)

Actually services called _from dom0_ are always allowed (no policy even checked).

> - Restore from a VM
> - Partial restore / Backup testing mecanism (to be defined, could be
> pre-restore, then selection of VMs to be restored, I will try several
> approaches). Can be achieved by extracting only qubes.xml which is in the
> first kbytes of the tar file (it also works if it is gpg encrypted). One issue
> is that qubes.xml contains all the VMs, not the only one that have been backuped.

Perhaps the backup could contain one additional file (at the beginning) with only
the list of VMs?
qubes.xml needs to contain (almost) all VMs because of dependencies - if you
back up a VM connected to "firewallvm" and based on the "fedora-18-x64" template,
qubes.xml must also contain those dependent VMs.

> - GUI backup/restore adaptation to appvm or encryption selection.

So currently, to restore such a backup, one needs to manually download, decrypt
and unpack the archive, right?

Joanna Rutkowska

Aug 14, 2013, 9:37:31 AM
to qubes...@googlegroups.com, Olivier Médoc
On 08/14/13 11:20, Olivier Médoc wrote:
> So far, the following things are missing:
> - Qubes service policies for qubes.Backup (is it normal that it is
> accepted by default even if I didn't created any policy file ?)
> - Restore from a VM
> - Partial restore / Backup testing mecanism (to be defined, could be
> pre-restore, then selection of VMs to be restored, I will try several
> approaches). Can be achieved by extracting only qubes.xml which is in
> the first kbytes of the tar file (it also works if it is gpg encrypted).
> One issue is that qubes.xml contains all the VMs, not the only one that
> have been backuped.

Hm, perhaps we could, when making a backup, re-create the qubes.xml on
the fly so that it only refers to the VMs that are in the backup, and
perhaps only the VMs on which those depend (netvms, templates). This way
one could make a backup of just some of the VMs and hand it to
somebody else, without fearing also disclosing the names of other VMs and
generally the structure of their partitioning...?

This is, however, a separate issue from this ticket, I think.

joanna.


Joanna Rutkowska

Aug 14, 2013, 11:35:39 AM
to qubes...@googlegroups.com, Olivier Médoc
In order to restore only selected VMs from a backup that is served as a
stream from another VM (our case here):

1) We send the (encrypted) qubes.xml first, plus its signature.
2) The qvm-backup-restore, after receiving this file and verifying the
signature, offers the user a choice of which VMs to decrypt (ok, as this
is a command-line tool, it should just display the ascii-art table like
it does now, plus a Y/N prompt -- if the user wanted to exclude some
of the VMs, the tool would be restarted with exceptions given as -x
options, as it is now).
3) The qvm-backup-restore continues to receive the following files that were
recorded as part of the backup (it doesn't explicitly inform the VM that
it wants to continue receiving them -- it just continues to do read() from
the descriptor). In case the files are for a VM that is *not* to be
restored, this stream of bytes goes into /dev/null; otherwise it is
stored in Dom0 tmp, then verified and then mv-ed to the specific
directory under /var/lib/qubes/.

So all we need is that our restoring VM provides us with a stream
similar to what our qvm-copy-to-vm does. Except that we also need
signatures (HMACs) of some sort for each of those files.

For simplicity, I think there should be no additional Dom0->VM
communication (and then VM->Dom0 responses) -- the VM always serves the
same stream, just that some parts of it go into /dev/null on the Dom0 side.
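
A rough dom0-side sketch of that loop, assuming the stream is a tar archive and "wanted" is a hypothetical selection predicate:

import sys
import tarfile

backup = tarfile.open(fileobj=sys.stdin, mode="r|")
for member in backup:
    if not wanted(member.name):
        continue                 # excluded VM: the bytes are read and dropped
    # store in a Dom0 tmp dir, verify the matching hmac member, then mv into
    # place under /var/lib/qubes/
    backup.extract(member, "/var/lib/qubes/restore-tmp")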

joanna.


Joanna Rutkowska

Aug 14, 2013, 11:39:41 AM
to qubes...@googlegroups.com, Olivier Médoc
So, just to clarify, it's all about preparing the original blob stream
in a smart way (header, qubes.xml, hmac, header, file-XXX, hmac, etc).
This should be done by qvm-backup, of course. Perhaps the qubes.xml
could be re-created on the fly, as I discussed in another message, to
refer to only the actual VMs that are part of the backup (plus deps).
The agent in the VM that serves this back to Dom0 is stupid -- it just
writes the whole stream to the descriptor.

joanna.


Marek Marczykowski-Górecki

Aug 14, 2013, 1:36:55 PM
to qubes...@googlegroups.com, Joanna Rutkowska, Olivier Médoc
I'm quite against inventing a new "file format" for that, for a simple reason: to
be able to access the data without a full Qubes system. After all, backups are
for recovery situations.

So perhaps still use a tar archive, but place the files in the right order - as
proposed by Joanna. The question is whether it is possible to "pause" unpacking
files right after qubes.xml+hmac and wait for user confirmation.

Joanna Rutkowska

Aug 14, 2013, 3:43:49 PM
to Marek Marczykowski-Górecki, qubes...@googlegroups.com, Olivier Médoc
Ok, so tar with files sequenced like this:

1) qubes.xml
2) qubes.xml's hmac
3) file1
4) file1's hmac
5) etc.

>
> So perhaps still use tar archive, but place files in the right order - as
> proposed by Joanna. The question is if it is possible to "pause" unpacking
> files right after qubes.xml+hmac and wait for user confirmation.
>

The vchan should support it, no? I.e. when the receiver stops read()ing
from the descriptor?

j.


Marek Marczykowski-Górecki

Aug 14, 2013, 3:50:54 PM
to Joanna Rutkowska, qubes...@googlegroups.com, Olivier Médoc
Yes. The question is how to hold the tar process (and when).

Joanna Rutkowska

Aug 14, 2013, 4:03:57 PM
to Marek Marczykowski-Górecki, qubes...@googlegroups.com, Olivier Médoc
Perhaps we can start tar twice:
1) > head -n XXX | tar x | ... (plus, save the original stream as header)
2) replay the header, then continue reading the socket, and pipe through
tar again.

?
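
For what it's worth, a rough sketch of that two-pass trick (sizes, paths and error handling are hand-waved assumptions; the first tar will complain about the truncated stream, which can be ignored here):

import subprocess
import sys

HEADER_SIZE = 64 * 1024                    # assumed to cover qubes.xml + hmac
header = sys.stdin.read(HEADER_SIZE)       # 1) grab and keep the stream head

peek = subprocess.Popen(["tar", "-x", "-O", "-f", "-", "qubes.xml"],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
qubes_xml, _ = peek.communicate(header)    # parse only the saved header
# ... verify the hmac, show the VM list, ask the user to confirm ...

full = subprocess.Popen(["tar", "-x", "-C", "/var/lib/qubes/restore-tmp"],
                        stdin=subprocess.PIPE)
full.stdin.write(header)                   # 2) replay the saved header ...
for chunk in iter(lambda: sys.stdin.read(1 << 20), ""):
    full.stdin.write(chunk)                # ... then keep piping the live stream
full.stdin.close()
full.wait()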