DRBD, LVM and snapshots


Roberto Espinoza

May 13, 2014, 5:27:28 AM
to gan...@googlegroups.com
Hello everyone,

I know Ganeti doesn't support taking snapshots of LVM volumes, so I am trying to do that manually, but I am a bit confused.

Currently we back up instances by running "dd" against their DRBD devices. Of course we have to shut the instance down and accept downtime until the "dd" copy is done, which takes a few minutes (the disks are small).

Anyway, I was wondering if it is possible to shut down the instance, manually create snapshots of the data disks, and then start the instance again.

My questions are... 

Do I have to do the same for the meta disk?
When restoring the data, can I just "dd" the image directly back onto the respective LVs?

I can understand how doing the "dd" directly to /dev/drbdXX keeps everything in sync, but I have the feeling it will be different if I start writing directly to the data LV (and the meta LV, if needed).
I was thinking that maybe I have to do the "dd" restore on both nodes for a recovery.

Anyhow, in short, I just want to be able to snapshot the LVs so I can start the instance right away while the backup runs. Downtime is acceptable if it is short, and being able to do this would certainly reduce our 5-minute downtime to under 30 seconds or so.


Thank you again.

Neal Oakey

May 13, 2014, 7:02:47 AM
to gan...@googlegroups.com
Hi,

see in-line

Greetings,
Neal

On 13.05.2014 11:27, Roberto Espinoza wrote:
> Hello everyone,
>
> I know Ganeti doesn't support to do snapshots on LVM so I am trying to
> do that manually but I am a bit confused.
What do you mean by this?
>
> Currently we are doing backups using "dd" and the DRBD devices for an
> instance, of course we have to shutdown and experience downtime until
> the "dd" copy is done which is a few minutes (small disk sizes).
if you want to back up VMs, just use: gnt-backup export -n <target_node>
[opts...] <name>

How this backup works depends on the OS template type you used.
For ext[2-4], debootstrap+default should do, but if you are using LVM or
other filesystems inside the VM you will need a raw image
(https://code.google.com/p/ganeti/wiki/OSDefinitions)
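For instance, an export/import round-trip might look roughly like this (a sketch only; the node names, instance name, and export directory are hypothetical, and the exact import options depend on your cluster setup):

```shell
# Export a snapshot of the instance to the export directory on node2
gnt-backup export -n node2.example.com instance1.example.com

# Later, re-create the instance from that export
gnt-backup import -t drbd \
  --src-node node2.example.com \
  --src-dir /srv/ganeti/export/instance1.example.com \
  -n node1.example.com:node2.example.com \
  instance1.example.com
```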
>
> Anyway, I was wondering if it is possible to shutdown the instance and
> then manually create snapshots for the data disks and then starting
> the instance again.
>
> My questions are...
>
> Do I have to do the same for the meta disk?
> When restoring the data, can I just "dd" the image directly to their
> respective LVMs ?
The meta disk only contains information for/from DRBD; you won't need
it if you export the VM with gnt-backup.
>
> I can understand how the "dd" being done directly to the /dev/drbdXX
> can keep everything in sync but I have the feeling it will be
> different if I start writing over the LVM for the data (and meta if
> needed).
> I was thinking that maybe I have to do the "dd" restore on both nodes
> for a recovery.
>
> Anyhow, in short I just want to be able to snapshot the LVMs so I can
> start the instance right away while I backup. Downtime is acceptable
> if short and certainly being able to do this will reduce our 5min
> downtime to just < 30sec or so.
>
Downtime with gnt-backup is only one reboot (for more details, check the
docs)
>
> Thank you again.

Roberto Espinoza

May 13, 2014, 8:41:36 AM
to gan...@googlegroups.com

Thank you for your reply.

Yes, I'm using LVM inside the instances; that's why I can't use the default backup scripts.

I'm trying to create a raw image.

Right now I can do that with the same downtime as gnt-backup (shutdown, backup, start).

That's why I'm looking to create a snapshot of the LV used by DRBD, in this case.

Shutdown, LVM snapshot, start, backup.
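Spelled out, that workflow might look something like this (a sketch only; the VG name, the LV name backing the instance's DRBD data device, and the snapshot size are hypothetical, and the snapshot must be large enough to absorb writes made while it exists):

```shell
gnt-instance shutdown instance1
# Snapshot the data LV backing the instance's DRBD device (not the _meta LV)
lvcreate -s -L 2G -n instance1-snap /dev/xenvg/<disk-uuid>.disk0_data
gnt-instance startup instance1
# Back up from the snapshot while the instance is already running again
dd if=/dev/xenvg/instance1-snap bs=1M | gzip > /backup/instance1-disk0.img.gz
lvremove -f /dev/xenvg/instance1-snap
```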

My question is: can I safely create a snapshot of just the data LV (not the metadata LV)?

Second, to restore, do I have to activate the disks and then dd the image to the LVM in the primary node? Does it work or do I have to restore in both nodes?

Thank you.

Neal Oakey

May 13, 2014, 8:51:43 AM
to gan...@googlegroups.com
Hi,

as far as I know, Ganeti does take an LVM snapshot during export.
I don't know which version added that, but 2.9 does it.

To the other questions:
Just use gnt-backup - that should do all you need ;)

Note: gnt-backup import will (re?)create a new VM based on the image and the settings from the config file.

Greetings,
Neal

Roberto Espinoza

May 13, 2014, 11:26:23 AM
to gan...@googlegroups.com

Does the export script from 2.9 support LVM in guests?

I think that's my biggest problem right now.

I'm running Ganeti 2.5.

Thank you

candlerb

May 14, 2014, 12:57:53 PM
to gan...@googlegroups.com
On Tuesday, 13 May 2014 16:26:23 UTC+1, rreg wrote:

Does the script from 2.9 support LVM in guests?

If you use ganeti-instance-image, then setting NOMOUNT=yes in your variants file will cause it to do a raw, block-by-block export of the volume, in qcow2 format.

The way I have it set up is:

==> /etc/ganeti/instance-image/variants/cd.conf <==
CDINSTALL="yes"
NOMOUNT="yes"

==> /etc/ganeti/instance-image/variants.list <==
default
cd

Then any instance installed with -o image+cd gets no partitioning (it assumes the partitioning is done by an installer ISO) *and* exports as a qcow2 dump. This should work with any sort of data, including LVM, although when you restore, it would have to go back onto a disk of exactly the same size.

Now, I think you said you were using ganeti-instance-debootstrap, which doesn't have this option. But the export scripts are very, very simple:

cat /usr/share/ganeti/os/debootstrap/export
cat /usr/share/ganeti/os/image/export

so with a little work you should be able to take the NOMOUNT functionality from image/export and merge it into debootstrap/export.

Alternatively, I believe it's possible to gnt-instance modify to change the os-type of an already-installed image to image+cd, then you could just export. But you'd lose the ability to do a reinstall of that instance.

Also note: when you have a cached image, the debootstrap installer does little more than untar it (see /usr/share/ganeti/os/debootstrap/create). You can actually use instance-image to untar an image. However instance-image assumes that you want to create a system with two or three partitions: either boot and root; or boot, swap and root.

Regards,

Brian.

P.S. With a bit of investment of time, someone could write export/import scripts which understand LVM and dump and restore each of the logical volumes separately. This should be easier once the scripts can run inside the guest.

Roberto Espinoza

May 14, 2014, 9:51:59 PM
to gan...@googlegroups.com
Hello Brian,

Thank you for your insight.

Then I will tackle the task of modifying the debootstrap scripts to support LVM, as this is a common setup on the instances I manage: we use an LVM volume so we can grow it as needed for databases, logs, etc., while still keeping the normal partitions from the create script for the base OS.

Once I am done I will share it here, but I believe I have to sign a contributor agreement, and that requires some paperwork at my company.


Regards,
Roberto

candlerb

May 15, 2014, 5:23:19 AM
to gan...@googlegroups.com
On Thursday, 15 May 2014 02:51:59 UTC+1, rreg wrote:
Hello Brian,

Thank you for your insight.

Then I will tackle on the task to modify the scripts for the debootstrap to be able to support LVM as this is a common setup on the instances I manage as we use the LVM volume to grow it to our needs for databases, logs, etc.

OK.

But before you do too much work on this, remember in a virtualized environment you may be able to avoid LVM in the guest, as there is already LVM in the underlying host.

What I mean is, instead of creating (say) three logical volumes in the guest, you can create a VM with three disks, which appear as /dev/vda (root), /dev/vdb (for database files), /dev/vdc (for logs). Each of these maps to a logical volume in the host - or DRBD over logical volume - but the guest need not be aware of that.

When you want to grow a volume, then in the host you can do
gnt-instance grow-disk
which grows the logical volume representing one of these virtual disks.

Reboot the guest, run resize2fs inside the guest, and you're done. Make your filesystems directly on the disks (/dev/vdb), not in a partition (/dev/vdb1), to make this easy.
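In concrete terms, the grow sequence looks roughly like this (instance name, disk index, and device name are examples):

```shell
# On the master node: grow the second disk (index 1, /dev/vdb in the guest) by 10G
gnt-instance grow-disk instance1 1 10G
gnt-instance reboot instance1

# Then, inside the guest, once it sees the larger disk:
resize2fs /dev/vdb
```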

Roberto Espinoza

May 15, 2014, 6:38:09 AM
to gan...@googlegroups.com

I can just resize like that?
I'm doing that, but for the LVM volumes. I remember trying grow-disk once, but I couldn't get it to work, or maybe I did it wrong.

I'll try it again, because if that works I can actually get rid of the LVM volumes; that's the only reason we use them.

Thanks again.

candlerb

May 15, 2014, 10:49:49 AM
to gan...@googlegroups.com
On Thursday, 15 May 2014 11:38:09 UTC+1, rreg wrote:

I can just resize like that?

Sure :-)

If you increase the size of, say, the second disk (which ganeti calls disk 1, and your guest will see as /dev/vdb), then you'll need a restart for the guest to pick up the larger disk, and then

    resize2fs /dev/vdb

will grow the filesystem. Again, to be clear: this is if the filesystem is written directly to /dev/vdb, not to /dev/vdb1.

Actually, there's a hack which lets you notify the guest of the increased disk size without even needing a restart. This is something that Phil Regnauld pointed me to, and is documented in issue 258.

Just testing it now:

* inside guest

root@wheezy21:~# blockdev --getsize64 /dev/vda
1073741824

* on ganeti master node

root@host19:~# gnt-instance grow-disk wheezy21 0 1G
Thu May 15 14:26:44 2014 Growing disk 0 of instance 'wheezy21' by 1.0G to 2.0G
Thu May 15 14:26:45 2014  - INFO: Waiting for instance wheezy21 to sync disks
Thu May 15 14:26:45 2014  - INFO: Instance wheezy21's disks are in sync

* in guest, nothing has changed yet

root@wheezy21:~# blockdev --getsize64 /dev/vda
1073741824

* on the node where the guest is running

root@host21:~# echo "info block" | /usr/bin/socat STDIO UNIX-CONNECT:/var/run/ganeti/kvm-hypervisor/ctrl/wheezy21.monitor
QEMU 2.0.0 monitor - type 'help' for more information
(qemu) info block
hotdisk-aae73dcc-pci-4: /var/run/ganeti/instance-disks/wheezy21:0 (raw)

ide1-cd0: [not inserted]
    Removable device: not locked, tray closed

floppy0: [not inserted]
    Removable device: not locked, tray closed

sd0: [not inserted]
    Removable device: not locked, tray closed
(qemu)
root@host21:~# echo "block_resize hotdisk-aae73dcc-pci-4 0G" | /usr/bin/socat STDIO UNIX-CONNECT:/var/run/ganeti/kvm-hypervisor/ctrl/wheezy21.monitor
QEMU 2.0.0 monitor - type 'help' for more information
(qemu) block_resize hotdisk-aae73dcc-pci-4 0G
(qemu) 
root@host21:~#

* back in guest

root@wheezy21:~# blockdev --getsize64 /dev/vda

Yay :-)

Roberto Espinoza

May 15, 2014, 11:28:34 AM
to gan...@googlegroups.com

I'll try this tomorrow; if it works, I'll be converting my LVM guests to regular instances.

Hopefully this is useful for someone else, I'll post some results tomorrow.

Regards,
Roberto

Karsten Heymann

May 22, 2014, 6:02:12 AM
to gan...@googlegroups.com
Hi Brian,

2014-05-15 11:23 GMT+02:00 candlerb <b.ca...@pobox.com>:
What I mean is, instead of creating (say) three logical volumes in the guest, you can create a VM with three disks, which appear as /dev/vda (root), /dev/vdb (for database files), /dev/vdc (for logs). Each of these maps to a logical volume in the host - or DRBD over logical volume - but the guest need not be aware of that.

When you want to grow a volume, then in the host you can do
gnt-instance grow-disk
which grows the logical volume representing one of these virtual disks.

Reboot the guest, do resize2fs inside the guest, and you're done. Make your filesystems directly on the disks (/dev/vdb) not in a partition (/dev/vdb1) to make this easy.

I really like that idea, but doesn't that mean I cannot use a bootloader and have to load the kernel from the node (kernel_path/initrd_path)? I really try to avoid that because of the need to keep that kernel in sync with the kernel modules inside the instance.

Best
Karsten

Roberto Espinoza

May 22, 2014, 12:04:33 PM
to gan...@googlegroups.com

Hello,

Do you have any tips on how to accomplish this with the Ganeti scripts? It seems they always create an msdos partition and install the OS there.
I'm using paravirtualized instances, so I don't need a boot loader. I tried using "none" as the partition table, but it didn't work. I may be missing something.

If you could point me where to look, I can take it from there.

Thank you.

On May 15, 2014 11:49 PM, "candlerb" <b.ca...@pobox.com> wrote:

candlerb

May 22, 2014, 12:50:01 PM
to gan...@googlegroups.com
On Thursday, 22 May 2014 16:02:12 UTC+6, Karsten Heymann wrote:

I really like that idea, but doesn't that mean I cannot use a bootloader and have to load the kernel from the node (kernel_path/initrd_path)? I really try to avoid that because of the need to keep that kernel in sync with the kernel modules inside the instance.


I suggest you simply use a partitioned disk for the first disk (with a regular MBR, boot loader etc), and non-partitioned for the subsequent disks.

The first disk could be very small, as it only needs your /boot filesystem. The default configuration of Ganeti refuses to create volumes smaller than 1G, but this can be adjusted; then you could create, say, a 256MB disk to boot from, a second disk for your root filesystem, a third disk for your /var filesystem, etc.
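Such an instance could be created along these lines (a sketch; names and sizes are illustrative, and the 256M boot disk assumes your cluster's minimum-disk-size policy has been relaxed as described above):

```shell
gnt-instance add -t drbd -o debootstrap+default \
  --disk 0:size=256M \
  --disk 1:size=10G \
  --disk 2:size=20G \
  -n node1.example.com:node2.example.com \
  instance1.example.com
```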

candlerb

May 22, 2014, 12:54:48 PM
to gan...@googlegroups.com
On Thursday, 22 May 2014 22:04:33 UTC+6, rreg wrote:

Hello,

Do you have any tips on how to accomplish this with the Ganeti scripts? It seems they always create an msdos partition and install the OS there.

Which ganeti scripts are you talking about?

* ganeti-instance-debootstrap has an option PARTITION_STYLE=none which will install the filesystem directly onto the disk with no partitioning
* ganeti-instance-image always installs either two or three partitions (depending on whether you want swap or not)
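With ganeti-instance-debootstrap, that option goes in the variant's config file, e.g. (the path and variant name here are just examples):

```shell
# /etc/ganeti/instance-debootstrap/variants/nopart.conf
PARTITION_STYLE="none"
```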

But I suspect what you want to do is to install a normal first disk with an msdos partition, and subsequent disks for your data volumes.

If your root filesystem is on the first disk, which has an msdos partition table, it *is* still possible to grow it.

1. grow the disk
2. in the guest, go into fdisk
3. make a note of the partition start
4. delete the partition (yikes!)
5. create a new partition with the same start offset, up to end of disk

You will probably now need to reboot the guest to re-read the partition table, since it was already in use. Then use resize2fs to grow the filesystem.
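As a sketch of those steps (instance and device names are examples; double-check the recorded start sector before writing anything):

```shell
# 1. On the master node: grow the disk
gnt-instance grow-disk instance1 0 5G
# 2. Inside the guest: note the start sector of the partition
fdisk -l /dev/vda
# 3-5. In fdisk: delete /dev/vda1, recreate it with the SAME start sector
#      and the default (end-of-disk) last sector, then write the table
fdisk /dev/vda
# 6. Reboot so the kernel re-reads the in-use partition table, then grow
reboot
resize2fs /dev/vda1
```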

But if you make a separate disk for /var (say), and move all the data off your root partition into the /var disk, then this second disk doesn't need to be partitioned so is easy to manage.

Of course it would be nice to use multiple disks attached *at the time you install the OS*, but I think this will only work if you do a manual install, e.g. from an ISO image, so that you can tell the partitioner how you want it set up.

Roberto Espinoza

May 22, 2014, 9:23:22 PM
to gan...@googlegroups.com
Thank you for your prompt reply.

I see what you mean. I tried it once, but now, after going more in depth, I can see that the partition starts at sector 1 on the disks created by the script (which is normal, as it uses sfdisk), whereas fdisk will start it at 2048. I believe that's why I nuked my disks that time while testing.

I will try again and do it with sfdisk so I can specify sector 1; it should then work as expected.

Rebooting is not a problem as our services can support that so it may work.

Still, at this point I would like to go with your first suggestion and add a second disk and resize it.

Thank you again.