Performance issues with 2.12.4 after upgrade from 2.11.6


Anatoliy Dmytriyev

Jun 25, 2015, 9:53:07 AM
to gan...@googlegroups.com
Hello,

We upgraded our ganeti 2.11.6 to ganeti 2.12.4 because of performance issues with 2.11.6.

However, we still have performance issues with the new version:
1. ‘gnt-instance info’ takes a lot of time: 17 sec on average, but up to 100 sec in my tests.
2. Every 5 minutes (it looks like */5 in crontab) a GROUP_VERIFY_DISKS job runs, which blocks other tasks like gnt-instance add/startup/shutdown - those tasks stay in “waiting” status until GROUP_VERIFY_DISKS finishes. At the same time, every 5 minutes I see this message in /var/log/ganeti/watcher.log:
ganeti-watcher pid=42043 ERROR Can't acquire lock on state file /var/lib/ganeti/watcher.cc1ca697-6ef8-478e-97ef-e1ed25ad281b.data: File already locked



The only modification from a standard setup in our cluster is:

gnt-cluster modify --ipolicy-spindle-ratio=128


An example GROUP_VERIFY_DISKS job:
# gnt-job info 70184
- Job ID: 70184
  Status: success
  Received:         2015-06-25 15:30:04.390484
  Processing start: 2015-06-25 15:30:05.368574 (delta 0.978090s)
  Processing end:   2015-06-25 15:35:22.434716 (delta 317.066142s)
  Total processing time: 318.044232 seconds
  Opcodes:
  - Opcode: OP_GROUP_VERIFY_DISKS
    Status: success
    Processing start: 2015-06-25 15:30:05.368574
    Execution start:  2015-06-25 15:30:05.897476
    Processing end:   2015-06-25 15:35:22.434697
    Input fields:
      OP_ID: OP_GROUP_VERIFY_DISKS
      comment: None
      debug_level: 0
      depends: None
      group_name: cc1ca697-6ef8-478e-97ef-e1ed25ad281b
      priority: 10
      reason:
      - - gnt:watcher
        - Verifying disks of group cc1ca697-6ef8-478e-97ef-e1ed25ad281b
        - 1435239004291613184
      - - gnt:opcode:group_verify_disks
        - job=70184;index=0
        - 1435239004390484000
      - - gnt:daemon:masterd:pickup:op_group_verify_disks
        - job=70184;index=0
        - 1435239005265840896
    Result:
      -
      -
      -
    Execution log:



This is our current setup:

OS: Debian GNU/Linux 7.8 (wheezy) 3.2.0-4-amd64 (all nodes)

# gnt-node list
Node  DTotal  DFree MTotal MNode  MFree Pinst Sinst
node1   2.2T 802.4G 252.4G 52.1G 208.6G    23    24
node2   2.5T   1.1T 126.1G  9.2G 116.7G    24    23
node3   2.5T   1.1T 126.1G 20.5G 105.4G    24    23



# gnt-cluster version
Software version: 2.12.4
Internode protocol: 2120000
Configuration format: 2120000
OS api version: 20
Export interface: 0
VCS version: (ganeti) version v2.12.4



# gnt-cluster --version
gnt-cluster (ganeti v2.12.4) 2.12.4

# gnt-cluster verify
Submitted jobs 70187, 70188
Waiting for job 70187 ...
Thu Jun 25 15:47:59 2015 * Verifying cluster config
Thu Jun 25 15:47:59 2015 * Verifying cluster certificate files
Thu Jun 25 15:47:59 2015 * Verifying hypervisor parameters
Thu Jun 25 15:47:59 2015 * Verifying all nodes belong to an existing group
Waiting for job 70188 ...
Thu Jun 25 15:48:01 2015 * Verifying group 'default'
Thu Jun 25 15:48:01 2015 * Gathering data (3 nodes)
Thu Jun 25 15:48:01 2015 * Gathering information about nodes (3 nodes)
Thu Jun 25 15:48:10 2015 * Gathering disk information (3 nodes)
Thu Jun 25 15:52:00 2015 * Verifying configuration file consistency
Thu Jun 25 15:52:00 2015 * Verifying node status
Thu Jun 25 15:52:00 2015 * Verifying instance status
Thu Jun 25 15:52:00 2015 * Verifying orphan volumes
Thu Jun 25 15:52:00 2015 * Verifying N+1 Memory redundancy
Thu Jun 25 15:52:00 2015 * Other Notes
Thu Jun 25 15:52:00 2015   - NOTICE: 1 non-redundant instance(s) found.
Thu Jun 25 15:52:00 2015 * Hooks Results







Klaus Aehlig

Jun 29, 2015, 4:47:36 AM
to gan...@googlegroups.com

Hello,

> We upgraded our ganeti 2.11.6 to ganeti 2.12.4 because of performance
> issues with 2.11.6.
>
> However, we still have performance issues with new version also:
> 1. ‘gnt-instance info’ takes a lot of time: in average 17 sec but up to 100
> sec in my tests
> 2. Every 5 minutes (it looks like */5 in crontab) it is running
> GROUP_VERIFY_DISKS which blocks other tasks like gnt-instance
> add/startup/shutdown - tasks get “waiting” status until GROUP_VERIFY_DISKS
> will be finished. At the same time every 5 minutes in
> /var/log/ganeti/watcher.log I see the message:
> ganeti-watcher pid=42043 ERROR Can't acquire lock on state file
> /var/lib/ganeti/watcher.cc1ca697-6ef8-478e-97ef-e1ed25ad281b.data: File
> already locked

Yes, we recommend running the watcher every 5 minutes, and most distributions
(including Debian) package Ganeti following this recommendation. The watcher
does submit GROUP_VERIFY_DISKS jobs, so the source of those jobs is understood.
Of course, the */5 in the cron job can be replaced by whatever schedule
suits you.

Concerning the performance issues, can you please provide a bit more information
about your setup (which hypervisor and which disk templates are you using)? Also,
is there anything unusual about your cluster, like an unusually high number of disks
(given that it is a 3-node cluster, I doubt it, but I just want to make sure)? Also,
when gnt-instance info takes a lot of time, is it lock-related (the INSTANCE_QUERY_DATA
job would be in status waiting, and gnt-debug locks would tell you why), CPU-bound
(which process?), or neither? Anything unusual in noded's log?

The problem I have is that as long as I cannot reproduce the issue it is hard
for me to find the root cause of the bad performance you're experiencing.

Thanks,
Klaus

--
Klaus Aehlig
Google Germany GmbH, Dienerstr. 12, 80331 Muenchen
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschaeftsfuehrer: Graham Law, Christine Elizabeth Flores

Neal Oakey

Jun 29, 2015, 5:54:52 AM
to gan...@googlegroups.com
Hello,

Have you excluded the DRBD devices from LVM, as recommended in http://docs.ganeti.org/ganeti/master/html/install.html#configuring-lvm ?
Otherwise, if GROUP_VERIFY_DISKS calls lvs or similar, it can take ages, depending on the number of DRBD devices you have.
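For reference, a sketch of an accept-list filter for /etc/lvm/lvm.conf (the device pattern here is an assumption; adjust it to your hardware's naming). Anything not matched by the accept rule, including /dev/drbd*, is rejected and never scanned:

```
devices {
    # Accept plain SCSI/SATA disks, reject everything else (incl. /dev/drbd*).
    filter = [ "a|^/dev/sd.*|", "r|.*|" ]
}
```

An accept-list like this is usually more robust than only rejecting /dev/drbd*, since any new device class is excluded by default.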

Greetings,
Neal

Anatoliy Dmytriyev

Jun 29, 2015, 7:00:41 AM
to gan...@googlegroups.com
Hi Neal,


Yes, I did. But I noticed a delay with lvs: a command like

lvs /dev/ganeti/52c6d3df-1e11-4802-a597-cb254de78b30.disk1_data

takes about 2 sec. I think that can at least partly explain our performance problems.
This is our /etc/lvm/lvm.conf file; do you see any wrong settings there?
devices {
    dir = "/dev"
    scan = [ "/dev" ]
    obtain_device_list_from_udev = 1
    preferred_names = [ ]
    filter = [ "r|/dev/cdrom|", "r|/dev/drbd[0-9]+|" ]
    cache_dir = "/run/lvm"
    cache_file_prefix = ""
    write_cache_state = 1
    sysfs_scan = 1
    multipath_component_detection = 1
    md_component_detection = 1
    md_chunk_alignment = 1
    data_alignment_detection = 1
    data_alignment = 0
    data_alignment_offset_detection = 1
    ignore_suspended_devices = 0
    disable_after_error_count = 0
    require_restorefile_with_uuid = 1
    pv_min_size = 2048
    issue_discards = 0
}

log {
    verbose = 0
    syslog = 1
    overwrite = 0
    level = 0
    indent = 1
    command_names = 0
    prefix = "  "
}

backup {
    backup = 1
    backup_dir = "/etc/lvm/backup"
    archive = 1
    archive_dir = "/etc/lvm/archive"
    retain_min = 10
    retain_days = 30
}

shell {
    history_size = 100
}

global {
    umask = 077
    test = 0
    units = "h"
    si_unit_consistency = 1
    activation = 1
    proc = "/proc"
    locking_type = 1
    wait_for_locks = 1
    fallback_to_clustered_locking = 1
    fallback_to_local_locking = 1
    locking_dir = "/run/lock/lvm"
    prioritise_write_locks = 1
    abort_on_internal_errors = 0
    detect_internal_vg_cache_corruption = 0
    metadata_read_only = 0
    mirror_segtype_default = "mirror"
    use_lvmetad = 0
}

activation {
    checks = 0
    udev_sync = 1
    udev_rules = 1
    verify_udev_operations = 0
    retry_deactivation = 1
    missing_stripe_filler = "error"
    use_linear_target = 1
    reserved_stack = 64
    reserved_memory = 8192
    process_priority = -18
    mirror_region_size = 512
    readahead = "auto"
    raid_fault_policy = "warn"
    mirror_log_fault_policy = "allocate"
    mirror_image_fault_policy = "remove"
    snapshot_autoextend_threshold = 100
    snapshot_autoextend_percent = 20
    thin_pool_autoextend_threshold = 100
    thin_pool_autoextend_percent = 20
    thin_check_executable = "/sbin/thin_check -q"
    use_mlockall = 0
    monitoring = 0
    polling_interval = 15
}

dmeventd {
    mirror_library = "libdevmapper-event-lvm2mirror.so"
    snapshot_library = "libdevmapper-event-lvm2snapshot.so"
    thin_library = "libdevmapper-event-lvm2thin.so"
}

...

Anatoliy Dmytriyev

Jun 29, 2015, 8:40:45 AM
to gan...@googlegroups.com
Hi Klaus,

> which hypervisor,

kvm

> which disk templates are you using

drbd

> Also, is there anything unusal about your cluster, like an unusual high number of disks

As I mentioned before, the only modification from a standard setup in our cluster is

gnt-cluster modify --ipolicy-spindle-ratio=128

The number of disks across all instances is 76 (for 71 instances).

> when gnt-instance info takes a lot of time, is that lock-related (the INSTANCE_QUERY_DATA job would be in status waiting, and gnt-debug locks would tell you why)

gnt-instance info runs immediately, without a "waiting" status: it gets "running" status right away, but it takes a long time to return.

However, gnt-instance add does sit in "waiting" status:
# gnt-debug locks
Name                                           Mode      Owner       Pending
cluster/BGL                                    shared    70832,70834 -
cluster/[lockset]                              -         -           -
instance/[lockset]                             -         -           -
instance/xxxxx                                 shared    70832       -
...
instance/xxxxx                                 shared    70832       -
instance/new_instance                          exclusive 70834       -
instance/xxxxx                                 shared    70832       -
node-alloc/[lockset]                           shared    70832       exclusive:70834
node/16accc41-71bc-479f-b425-a4ce48f57d6d      shared    70832       -
node/3406be72-5628-42d5-a0af-92bcf0861033      shared    70832       -
node/1421562c-6e0a-4194-8827-9c2605c71bb3      shared    70832       -
node/[lockset]                                 -         -           -
nodegroup/[lockset]                            -         -           -
nodegroup/cc1ca697-6ef8-478e-97ef-e1ed25ad281b shared    70832       -


and

# gnt-job list
70832 running GROUP_VERIFY_DISKS(cc1ca697-6ef8-478e-97ef-e1ed25ad281b)
70833 success OS_DIAGNOSE
70834 waiting INSTANCE_CREATE(new_instance)


> Anything unusual in noded's log?

Nothing except long (~2 sec) lvs/lvcreate/lvchange commands.

Thanks,
Anatoliy

Klaus Aehlig

Jun 29, 2015, 9:18:15 AM
to gan...@googlegroups.com

Hi Anatoliy,

thanks for all the detailed information. From what I can read from it,
the only thing that is working differently from how it should is, as you
already noticed in a different email, that the lv* commands take longer
than they should.

> Nothing except long (~2 sec) lvs/lvcreate/lvchange commands.

At that point, I'm not sure whether there is something in Ganeti
itself to be changed, or whether you somehow have to change your lv setup.
I'm sorry that I cannot provide you with a ready-made solution
for lvs tuning.

Regards,

Anatoliy Dmytriyev

Jun 29, 2015, 10:07:21 AM
to gan...@googlegroups.com
To be clear: is it normal that gnt-instance add/startup/shutdown waits until GROUP_VERIFY_DISKS has finished?

Regards,
Anatoliy

Klaus Aehlig

Jun 29, 2015, 10:39:16 AM
to gan...@googlegroups.com

Hi Anatoliy,

> To be clear: is it normal that gnt-instance add/startup/shutdown waits
> until GROUP_VERIFY_DISKS has finished?

yes, this is normal. GROUP_VERIFY_DISKS needs a shared lock on all instances
and nodes in that group to ensure that they do not change while verification
is going on. Instance startup/shutdown/..., however, needs an exclusive lock on
the instance. Instance addition needs an exclusive lock on the nodes the
instance is created on.
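The interaction between the shared and exclusive locks can be illustrated with a small sketch (Python threading; purely illustrative, not Ganeti's actual lock implementation). The "verify" job holds a shared lock for a long time, so the "create" job's exclusive acquisition must wait:

```python
import threading
import time

class SharedLock:
    """Toy reader-writer lock: many shared holders OR one exclusive holder."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0        # current shared holders
        self._writer = False     # exclusive holder present?

    def acquire_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            # Must wait until there is no holder of any kind.
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

lock = SharedLock()
events = []

def verify_disks():
    # Stands in for GROUP_VERIFY_DISKS: a long job under a shared lock.
    lock.acquire_shared()
    events.append("verify started")
    time.sleep(0.2)              # stands in for a slow lvs scan
    events.append("verify finished")
    lock.release_shared()

def create_instance():
    # Stands in for INSTANCE_CREATE: needs the exclusive lock, so it waits.
    lock.acquire_exclusive()
    events.append("create ran")
    lock.release_exclusive()

t1 = threading.Thread(target=verify_disks)
t1.start()
time.sleep(0.05)                 # ensure the shared lock is taken first
t2 = threading.Thread(target=create_instance)
t2.start()
t1.join()
t2.join()
print(events)  # create only runs after verify releases the shared lock
```

The point of the sketch: the exclusive acquirer is blocked for the entire duration of the shared holder's work, which is exactly why a slow GROUP_VERIFY_DISKS delays instance operations.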

So, that is working as intended. Normally GROUP_VERIFY_DISKS is fast enough
that the interruption it causes to the cluster is short enough not to be a
problem when run every 5 minutes by the watcher. With your lvs setup,
however, it unfortunately blocks the whole cluster most of the time.
So it might be worth reducing the frequency of the watcher runs for your
setup (say, to once an hour); it is also possible to pause the watcher for a
limited amount of time (gnt-cluster watcher pause ...) so that it does not
interfere while you submit a lot of changes to your cluster.
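As a sketch of the first option, assuming the Debian packaging puts the watcher cron entry in /etc/cron.d/ganeti (the exact path and command line may differ on your system; check your distribution's file):

```
# /etc/cron.d/ganeti -- hypothetical excerpt
# Before: run ganeti-watcher every 5 minutes
# */5 * * * * root [ -x /usr/sbin/ganeti-watcher ] && /usr/sbin/ganeti-watcher
# After: run it only at minute 17 of every hour
17 * * * * root [ -x /usr/sbin/ganeti-watcher ] && /usr/sbin/ganeti-watcher
```

For a temporary pause instead, gnt-cluster watcher pause takes a duration argument, e.g. gnt-cluster watcher pause 1h.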

candlerb

Jun 30, 2015, 7:15:51 AM
to gan...@googlegroups.com
You could try:

strace -f -tt lvs /dev/ganeti/52c6d3df-1e11-4802-a597-cb254de78b30.disk1_data

and see if that shows you where it is hanging.

Hrvoje Ribicic

Jun 30, 2015, 7:22:27 AM
to gan...@googlegroups.com
Hi Anatoliy,

Setting up a more explicit filter could help. Try following the suggestions from:


especially using -vvvv to figure out what is being scanned / where the slowdown is.

Cheers,
Riba
Hrvoje Ribicic
Ganeti Engineering
Google Germany GmbH
Dienerstr. 12, 80331, München

Geschäftsführer: Graham Law, Christine Elizabeth Flores

Anatoliy Dmytriyev

Jun 30, 2015, 9:41:36 AM
to gan...@googlegroups.com

Hi Riba and candlerb,

Thanks for your help. The performance issue seems to be solved after I used

filter = ["a|^/dev/sd*|", "r|.*|"]

OP_GROUP_VERIFY_DISKS now takes ~12 sec, and "gnt-instance info" returns a result after ~4 sec.

Thank you all for the help and explanations; I really appreciate the community's help and support.

Best regards,
Anatoliy
...