Cluster node cannot perform SST

explos...@gmail.com

Aug 9, 2016, 11:32:29 AM
to Percona Discussion
Running a 2-node + 1-arbitrator cluster (version: 5.6.31-77.0-1.jessie).

Today I restarted one of the nodes and it is refusing to carry out an SST.
The error log shows a continual loop of:

2016-08-09 16:23:57 9023 [Warning] WSREP: Member 1.0 (<hostname>) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable

Checking the status on the "good" node shows the following:

mysql> SHOW STATUS LIKE 'wsrep%';
+------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| wsrep_local_state_uuid | d01cd98f-2d5e-11e5-9a8a-670acddd40fb |
| wsrep_protocol_version | 7 |
| wsrep_last_committed | 150880588 |
| wsrep_replicated | 11813941 |
| wsrep_replicated_bytes | 10969366436 |
| wsrep_repl_keys | 37295208 |
| wsrep_repl_keys_bytes | 570088902 |
| wsrep_repl_data_bytes | 9643185310 |
| wsrep_repl_other_bytes | 0 |
| wsrep_received | 439 |
| wsrep_received_bytes | 60693 |
| wsrep_local_commits | 11812899 |
| wsrep_local_cert_failures | 4 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_max | 19 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_send_queue_avg | 0.019565 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_max | 5 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_recv_queue_avg | 0.034169 |
| wsrep_local_cached_downto | 150732901 |
| wsrep_flow_control_paused_ns | 169137648624 |
| wsrep_flow_control_paused | 0.000057 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 406 |
| wsrep_cert_deps_distance | 7325.477491 |
| wsrep_apply_oooe | 0.020543 |
| wsrep_apply_oool | 0.000002 |
| wsrep_apply_window | 1.021835 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 1.001283 |
| wsrep_local_state | 3 |
| wsrep_local_state_comment | Joined |
| wsrep_cert_index_size | 109 |
| wsrep_cert_bucket_count | 15184 |
| wsrep_gcache_pool_size | 134219036 |
| wsrep_causal_reads | 0 |
| wsrep_cert_interval | 0.025159 |
| wsrep_incoming_addresses | ,10.1.0.21:3306,10.1.0.20:3306 |
| wsrep_desync_count | 1 |
| wsrep_evs_delayed | 86d7990f-5e37-11e6-b5cb-53680241274d:tcp://10.1.0.21:4567:1,d2791a46-5e3e-11e6-a838-f218e446ecd4:tcp://10.1.0.21:4567:1 |
| wsrep_evs_evict_list | |
| wsrep_evs_repl_latency | 0.000463108/0.000546591/0.000665291/4.48513e-05/49 |
| wsrep_evs_state | OPERATIONAL |
| wsrep_gcomm_uuid | eb9c857f-436f-11e6-8ec3-92420b7dd3d6 |
| wsrep_cluster_conf_id | 43 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | d01cd98f-2d5e-11e5-9a8a-670acddd40fb |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_index | 2 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <in...@codership.com> |
| wsrep_provider_version | 3.16(r5c765eb) |
| wsrep_ready | ON |
+------------------------------+-------------------------------------------------------------------------------------------------------------------------+
59 rows in set (0.00 sec)


I assume the SST to the other node won't happen because wsrep_local_state_comment is "Joined" rather than "Synced". I also notice that wsrep_desync_count is 1, which I think is why the node won't return to "Synced". This status variable is a fairly recent addition to Percona XtraDB Cluster and I'm not sure how to check which operation is holding it at 1, or how to force it back to 0.
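
The only handle I can find is the wsrep_desync variable; presumably something like the following would show and clear a manually requested desync (a sketch only; I am not sure it applies when the desync is held by an internal operation such as a donor SST):

mysql> SHOW GLOBAL VARIABLES LIKE 'wsrep_desync';      -- ON only if desync was requested manually
mysql> SHOW GLOBAL STATUS LIKE 'wsrep_desync_count';   -- stays above 0 while any desync is still held
mysql> SET GLOBAL wsrep_desync = OFF;                  -- releases a manually requested desync only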

Can anybody shed some light on this?

Thanks,
Tom

Vadim Tkachenko

Aug 9, 2016, 11:42:05 AM
to percona-discussion
Tom,

Can you show the full log from the "good" node?



--
Vadim Tkachenko, CTO, Percona
Phone +1-919-249-7483, Skype: vadimtk153

explos...@gmail.com

Aug 9, 2016, 11:46:29 AM
to Percona Discussion
MySQL status on good node:

+-----------------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+-----------------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| Aborted_clients | 1046 |
| Aborted_connects | 26 |
| Binlog_snapshot_file | |
| Binlog_snapshot_position | 0 |
| Binlog_cache_disk_use | 3836 |
| Binlog_cache_use | 11816265 |
| Binlog_stmt_cache_disk_use | 0 |
| Binlog_stmt_cache_use | 0 |
| Bytes_received | 510 |
| Bytes_sent | 156542 |
| Com_admin_commands | 0 |
| Com_assign_to_keycache | 0 |
| Com_alter_db | 0 |
| Com_alter_db_upgrade | 0 |
| Com_alter_event | 0 |
| Com_alter_function | 0 |
| Com_alter_procedure | 0 |
| Com_alter_server | 0 |
| Com_alter_table | 0 |
| Com_alter_tablespace | 0 |
| Com_alter_user | 0 |
| Com_analyze | 0 |
| Com_begin | 0 |
| Com_binlog | 0 |
| Com_call_procedure | 0 |
| Com_change_db | 0 |
| Com_change_master | 0 |
| Com_check | 0 |
| Com_checksum | 0 |
| Com_commit | 0 |
| Com_create_db | 0 |
| Com_create_event | 0 |
| Com_create_function | 0 |
| Com_create_index | 0 |
| Com_create_procedure | 0 |
| Com_create_server | 0 |
| Com_create_table | 0 |
| Com_create_trigger | 0 |
| Com_create_udf | 0 |
| Com_create_user | 0 |
| Com_create_view | 0 |
| Com_dealloc_sql | 0 |
| Com_delete | 0 |
| Com_delete_multi | 0 |
| Com_do | 0 |
| Com_drop_db | 0 |
| Com_drop_event | 0 |
| Com_drop_function | 0 |
| Com_drop_index | 0 |
| Com_drop_procedure | 0 |
| Com_drop_server | 0 |
| Com_drop_table | 0 |
| Com_drop_trigger | 0 |
| Com_drop_user | 0 |
| Com_drop_view | 0 |
| Com_empty_query | 0 |
| Com_execute_sql | 0 |
| Com_flush | 0 |
| Com_get_diagnostics | 0 |
| Com_grant | 0 |
| Com_ha_close | 0 |
| Com_ha_open | 0 |
| Com_ha_read | 0 |
| Com_help | 0 |
| Com_insert | 0 |
| Com_insert_select | 0 |
| Com_install_plugin | 0 |
| Com_kill | 0 |
| Com_load | 0 |
| Com_lock_tables | 0 |
| Com_lock_tables_for_backup | 0 |
| Com_lock_binlog_for_backup | 0 |
| Com_optimize | 0 |
| Com_preload_keys | 0 |
| Com_prepare_sql | 0 |
| Com_purge | 0 |
| Com_purge_before_date | 0 |
| Com_purge_archived | 0 |
| Com_purge_archived_before_date | 0 |
| Com_release_savepoint | 0 |
| Com_rename_table | 0 |
| Com_rename_user | 0 |
| Com_repair | 0 |
| Com_replace | 0 |
| Com_replace_select | 0 |
| Com_reset | 0 |
| Com_resignal | 0 |
| Com_revoke | 0 |
| Com_revoke_all | 0 |
| Com_rollback | 0 |
| Com_rollback_to_savepoint | 0 |
| Com_savepoint | 0 |
| Com_select | 1 |
| Com_set_option | 0 |
| Com_signal | 0 |
| Com_show_binlog_events | 0 |
| Com_show_binlogs | 0 |
| Com_show_charsets | 0 |
| Com_show_client_statistics | 0 |
| Com_show_collations | 0 |
| Com_show_create_db | 0 |
| Com_show_create_event | 0 |
| Com_show_create_func | 0 |
| Com_show_create_proc | 0 |
| Com_show_create_table | 0 |
| Com_show_create_trigger | 0 |
| Com_show_databases | 1 |
| Com_show_engine_logs | 0 |
| Com_show_engine_mutex | 0 |
| Com_show_engine_status | 0 |
| Com_show_events | 0 |
| Com_show_errors | 0 |
| Com_show_fields | 0 |
| Com_show_function_code | 0 |
| Com_show_function_status | 0 |
| Com_show_grants | 0 |
| Com_show_index_statistics | 0 |
| Com_show_keys | 0 |
| Com_show_master_status | 0 |
| Com_show_open_tables | 0 |
| Com_show_plugins | 0 |
| Com_show_privileges | 0 |
| Com_show_procedure_code | 0 |
| Com_show_procedure_status | 0 |
| Com_show_processlist | 0 |
| Com_show_profile | 0 |
| Com_show_profiles | 0 |
| Com_show_relaylog_events | 0 |
| Com_show_slave_hosts | 0 |
| Com_show_slave_status | 0 |
| Com_show_slave_status_nolock | 0 |
| Com_show_status | 14 |
| Com_show_storage_engines | 0 |
| Com_show_table_statistics | 0 |
| Com_show_table_status | 0 |
| Com_show_tables | 0 |
| Com_show_thread_statistics | 0 |
| Com_show_triggers | 0 |
| Com_show_user_statistics | 0 |
| Com_show_variables | 0 |
| Com_show_warnings | 0 |
| Com_slave_start | 0 |
| Com_slave_stop | 0 |
| Com_stmt_close | 0 |
| Com_stmt_execute | 0 |
| Com_stmt_fetch | 0 |
| Com_stmt_prepare | 0 |
| Com_stmt_reprepare | 0 |
| Com_stmt_reset | 0 |
| Com_stmt_send_long_data | 0 |
| Com_truncate | 0 |
| Com_uninstall_plugin | 0 |
| Com_unlock_binlog | 0 |
| Com_unlock_tables | 0 |
| Com_update | 0 |
| Com_update_multi | 0 |
| Com_xa_commit | 0 |
| Com_xa_end | 0 |
| Com_xa_prepare | 0 |
| Com_xa_recover | 0 |
| Com_xa_rollback | 0 |
| Com_xa_start | 0 |
| Compression | OFF |
| Connection_errors_accept | 0 |
| Connection_errors_internal | 0 |
| Connection_errors_max_connections | 0 |
| Connection_errors_peer_address | 0 |
| Connection_errors_select | 0 |
| Connection_errors_tcpwrap | 0 |
| Connections | 2665441 |
| Created_tmp_disk_tables | 0 |
| Created_tmp_files | 15553 |
| Created_tmp_tables | 1 |
| Delayed_errors | 0 |
| Delayed_insert_threads | 0 |
| Delayed_writes | 0 |
| Flush_commands | 4 |
| Handler_commit | 0 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 0 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 0 |
| Handler_read_last | 0 |
| Handler_read_next | 0 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 29 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 28 |
| Innodb_buffer_pool_dump_status | not started |
| Innodb_buffer_pool_load_status | not started |
| Innodb_background_log_sync | 2947745 |
| Innodb_buffer_pool_pages_data | 435081 |
| Innodb_buffer_pool_bytes_data | 7128367104 |
| Innodb_buffer_pool_pages_dirty | 0 |
| Innodb_buffer_pool_bytes_dirty | 0 |
| Innodb_buffer_pool_pages_flushed | 30352568 |
| Innodb_buffer_pool_pages_LRU_flushed | 0 |
| Innodb_buffer_pool_pages_free | 740057 |
| Innodb_buffer_pool_pages_made_not_young | 18322553 |
| Innodb_buffer_pool_pages_made_young | 3401991 |
| Innodb_buffer_pool_pages_misc | 135574 |
| Innodb_buffer_pool_pages_old | 160444 |
| Innodb_buffer_pool_pages_total | 1310712 |
| Innodb_buffer_pool_read_ahead_rnd | 0 |
| Innodb_buffer_pool_read_ahead | 298097 |
| Innodb_buffer_pool_read_ahead_evicted | 0 |
| Innodb_buffer_pool_read_requests | 10196901049 |
| Innodb_buffer_pool_reads | 77617 |
| Innodb_buffer_pool_wait_free | 0 |
| Innodb_buffer_pool_write_requests | 117477416 |
| Innodb_checkpoint_age | 926 |
| Innodb_checkpoint_max_age | 108005254 |
| Innodb_data_fsyncs | 17076247 |
| Innodb_data_pending_fsyncs | 0 |
| Innodb_data_pending_reads | 0 |
| Innodb_data_pending_writes | 0 |
| Innodb_data_read | 6161256448 |
| Innodb_data_reads | 376663 |
| Innodb_data_writes | 44588509 |
| Innodb_data_written | 1010368811520 |
| Innodb_dblwr_pages_written | 30352568 |
| Innodb_dblwr_writes | 764793 |
| Innodb_deadlocks | 0 |
| Innodb_have_atomic_builtins | ON |
| Innodb_history_list_length | 4365 |
| Innodb_ibuf_discarded_delete_marks | 0 |
| Innodb_ibuf_discarded_deletes | 0 |
| Innodb_ibuf_discarded_inserts | 0 |
| Innodb_ibuf_free_list | 7833 |
| Innodb_ibuf_merged_delete_marks | 815 |
| Innodb_ibuf_merged_deletes | 740 |
| Innodb_ibuf_merged_inserts | 409 |
| Innodb_ibuf_merges | 392 |
| Innodb_ibuf_segment_size | 7835 |
| Innodb_ibuf_size | 1 |
| Innodb_log_waits | 0 |
| Innodb_log_write_requests | 18676107 |
| Innodb_log_writes | 12926889 |
| Innodb_lsn_current | 279656914624 |
| Innodb_lsn_flushed | 279656914624 |
| Innodb_lsn_last_checkpoint | 279656913698 |
| Innodb_master_thread_active_loops | 2454465 |
| Innodb_master_thread_idle_loops | 493817 |
| Innodb_max_trx_id | 1307752282 |
| Innodb_mem_adaptive_hash | 2561243664 |
| Innodb_mem_dictionary | 91131740 |
| Innodb_mem_total | 22355640320 |
| Innodb_mutex_os_waits | 495526 |
| Innodb_mutex_spin_rounds | 21855560 |
| Innodb_mutex_spin_waits | 3757174 |
| Innodb_oldest_view_low_limit_trx_id | 0 |
| Innodb_os_log_fsyncs | 13301951 |
| Innodb_os_log_pending_fsyncs | 0 |
| Innodb_os_log_pending_writes | 0 |
| Innodb_os_log_written | 15584184832 |
| Innodb_page_size | 16384 |
| Innodb_pages_created | 59389 |
| Innodb_pages_read | 376048 |
| Innodb_pages_written | 30352568 |
| Innodb_purge_trx_id | 1307752281 |
| Innodb_purge_undo_no | 0 |
| Innodb_row_lock_current_waits | 0 |
| Innodb_current_row_locks | 0 |
| Innodb_row_lock_time | 13075 |
| Innodb_row_lock_time_avg | 13 |
| Innodb_row_lock_time_max | 1435 |
| Innodb_row_lock_waits | 1005 |
| Innodb_rows_deleted | 391620 |
| Innodb_rows_inserted | 5285969 |
| Innodb_rows_read | 9664521965 |
| Innodb_rows_updated | 6669192 |
| Innodb_num_open_files | 630 |
| Innodb_read_views_memory | 16192 |
| Innodb_descriptors_memory | 8000 |
| Innodb_s_lock_os_waits | 597548 |
| Innodb_s_lock_spin_rounds | 18191425 |
| Innodb_s_lock_spin_waits | 633705 |
| Innodb_truncated_status_writes | 0 |
| Innodb_available_undo_logs | 128 |
| Innodb_x_lock_os_waits | 7291 |
| Innodb_x_lock_spin_rounds | 591869 |
| Innodb_x_lock_spin_waits | 47584 |
| Key_blocks_not_flushed | 0 |
| Key_blocks_unused | 13396 |
| Key_blocks_used | 105 |
| Key_read_requests | 17235757 |
| Key_reads | 4 |
| Key_write_requests | 7735055 |
| Key_writes | 1 |
| Last_query_cost | 10.499000 |
| Last_query_partial_plans | 1 |
| Max_statement_time_exceeded | 0 |
| Max_statement_time_set | 0 |
| Max_statement_time_set_failed | 0 |
| Max_used_connections | 243 |
| Not_flushed_delayed_rows | 0 |
| Open_files | 1 |
| Open_streams | 0 |
| Open_table_definitions | 164 |
| Open_tables | 206 |
| Opened_files | 5987898 |
| Opened_table_definitions | 0 |
| Opened_tables | 0 |
| Performance_schema_accounts_lost | 0 |
| Performance_schema_cond_classes_lost | 0 |
| Performance_schema_cond_instances_lost | 0 |
| Performance_schema_digest_lost | 0 |
| Performance_schema_file_classes_lost | 0 |
| Performance_schema_file_handles_lost | 0 |
| Performance_schema_file_instances_lost | 0 |
| Performance_schema_hosts_lost | 0 |
| Performance_schema_locker_lost | 0 |
| Performance_schema_mutex_classes_lost | 0 |
| Performance_schema_mutex_instances_lost | 0 |
| Performance_schema_rwlock_classes_lost | 0 |
| Performance_schema_rwlock_instances_lost | 0 |
| Performance_schema_session_connect_attrs_lost | 0 |
| Performance_schema_socket_classes_lost | 0 |
| Performance_schema_socket_instances_lost | 0 |
| Performance_schema_stage_classes_lost | 0 |
| Performance_schema_statement_classes_lost | 0 |
| Performance_schema_table_handles_lost | 0 |
| Performance_schema_table_instances_lost | 0 |
| Performance_schema_thread_classes_lost | 0 |
| Performance_schema_thread_instances_lost | 0 |
| Performance_schema_users_lost | 0 |
| Prepared_stmt_count | 0 |
| Qcache_free_blocks | 0 |
| Qcache_free_memory | 0 |
| Qcache_hits | 0 |
| Qcache_inserts | 0 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 0 |
| Qcache_queries_in_cache | 0 |
| Qcache_total_blocks | 0 |
| Queries | 83380419 |
| Questions | 16 |
| Rsa_public_key | |
| Select_full_join | 0 |
| Select_full_range_join | 0 |
| Select_range | 0 |
| Select_range_check | 0 |
| Select_scan | 1 |
| Slave_heartbeat_period | |
| Slave_last_heartbeat | |
| Slave_open_temp_tables | 0 |
| Slave_received_heartbeats | |
| Slave_retried_transactions | |
| Slave_running | OFF |
| Slow_launch_threads | 0 |
| Slow_queries | 0 |
| Sort_merge_passes | 0 |
| Sort_range | 0 |
| Sort_rows | 0 |
| Sort_scan | 0 |
| Ssl_accept_renegotiates | 0 |
| Ssl_accepts | 0 |
| Ssl_callback_cache_hits | 0 |
| Ssl_cipher | |
| Ssl_cipher_list | |
| Ssl_client_connects | 0 |
| Ssl_connect_renegotiates | 0 |
| Ssl_ctx_verify_depth | 0 |
| Ssl_ctx_verify_mode | 0 |
| Ssl_default_timeout | 0 |
| Ssl_finished_accepts | 0 |
| Ssl_finished_connects | 0 |
| Ssl_server_not_after | |
| Ssl_server_not_before | |
| Ssl_session_cache_hits | 0 |
| Ssl_session_cache_misses | 0 |
| Ssl_session_cache_mode | NONE |
| Ssl_session_cache_overflows | 0 |
| Ssl_session_cache_size | 0 |
| Ssl_session_cache_timeouts | 0 |
| Ssl_sessions_reused | 0 |
| Ssl_used_session_cache_entries | 0 |
| Ssl_verify_depth | 0 |
| Ssl_verify_mode | 0 |
| Ssl_version | |
| Table_locks_immediate | 50111288 |
| Table_locks_waited | 0 |
| Table_open_cache_hits | 0 |
| Table_open_cache_misses | 0 |
| Table_open_cache_overflows | 0 |
| Tc_log_max_pages_used | 0 |
| Tc_log_page_size | 0 |
| Tc_log_page_waits | 0 |
| Threadpool_idle_threads | 0 |
| Threadpool_threads | 0 |
| Threads_cached | 20 |
| Threads_connected | 158 |
| Threads_created | 350 |
| Threads_running | 1 |
| Uptime | 2951613 |
| Uptime_since_flush_status | 2951613 |
| wsrep_local_state_uuid | d01cd98f-2d5e-11e5-9a8a-670acddd40fb |
| wsrep_protocol_version | 7 |
| wsrep_last_committed | 150883950 |
| wsrep_replicated | 11817303 |
| wsrep_replicated_bytes | 10972083910 |
| wsrep_repl_keys | 37305743 |
| wsrep_repl_keys_bytes | 570250508 |
| wsrep_repl_data_bytes | 9645526010 |
| wsrep_repl_other_bytes | 0 |
| wsrep_received | 492 |
| wsrep_received_bytes | 61117 |
| wsrep_local_commits | 11816261 |
| wsrep_local_cert_failures | 4 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_max | 19 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_send_queue_avg | 0.019577 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_max | 5 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_recv_queue_avg | 0.030488 |
| wsrep_local_cached_downto | 150735904 |
| wsrep_flow_control_paused_ns | 169137648624 |
| wsrep_flow_control_paused | 0.000057 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 406 |
| wsrep_cert_deps_distance | 7323.414715 |
| wsrep_apply_oooe | 0.020553 |
| wsrep_apply_oool | 0.000002 |
| wsrep_apply_window | 1.021846 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 1.001284 |
| wsrep_local_state | 3 |
| wsrep_local_state_comment | Joined |
| wsrep_cert_index_size | 129 |
| wsrep_cert_bucket_count | 15184 |
| wsrep_gcache_pool_size | 134219036 |
| wsrep_causal_reads | 0 |
| wsrep_cert_interval | 0.025170 |
| wsrep_incoming_addresses | ,10.1.0.21:3306,10.1.0.20:3306 |
| wsrep_desync_count | 1 |
| wsrep_evs_delayed | 86d7990f-5e37-11e6-b5cb-53680241274d:tcp://10.1.0.21:4567:1,d2791a46-5e3e-11e6-a838-f218e446ecd4:tcp://10.1.0.21:4567:1 |
| wsrep_evs_evict_list | |
| wsrep_evs_repl_latency | 0.000394112/0.000522934/0.000722915/4.11242e-05/174 |
| wsrep_evs_state | OPERATIONAL |
| wsrep_gcomm_uuid | eb9c857f-436f-11e6-8ec3-92420b7dd3d6 |
| wsrep_cluster_conf_id | 43 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | d01cd98f-2d5e-11e5-9a8a-670acddd40fb |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_index | 2 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <in...@codership.com> |
| wsrep_provider_version | 3.16(r5c765eb) |
| wsrep_ready | ON |
+-----------------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
462 rows in set (0.01 sec)

Vadim Tkachenko

Aug 9, 2016, 11:52:23 AM
to Percona Discussion, explos...@gmail.com
Tom,

Not the MySQL status, but the full error.log content.

Thanks,
Vadim


explos...@gmail.com

Aug 9, 2016, 12:00:29 PM
to Percona Discussion, explos...@gmail.com
My apologies.

Error log from good node:

2016-08-09 14:35:42 19620 [Note] WSREP: declaring 3adeb376 at tcp://10.1.0.22:4567 stable
2016-08-09 14:35:42 19620 [Note] WSREP: forgetting fa95df4c (tcp://10.1.0.21:4567)
2016-08-09 14:35:42 19620 [Note] WSREP: Node 3adeb376 state prim
2016-08-09 14:35:42 19620 [Note] WSREP: view(view_id(PRIM,3adeb376,30) memb {
3adeb376,0
eb9c857f,0
} joined {
} left {
} partitioned {
fa95df4c,0
})
2016-08-09 14:35:42 19620 [Note] WSREP: save pc into disk
2016-08-09 14:35:42 19620 [Note] WSREP: forgetting fa95df4c (tcp://10.1.0.21:4567)
2016-08-09 14:35:42 19620 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2016-08-09 14:35:42 19620 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2016-08-09 14:35:42 19620 [Note] WSREP: STATE EXCHANGE: sent state msg: 2b47095e-5e36-11e6-8602-26724c5f0983
2016-08-09 14:35:42 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 2b47095e-5e36-11e6-8602-26724c5f0983 from 0 (garb)
2016-08-09 14:35:42 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 2b47095e-5e36-11e6-8602-26724c5f0983 from 1 (mysql1.i.srv.lan.switchsys.com)
2016-08-09 14:35:42 19620 [Note] WSREP: 'garb' demoted SYNCED->PRIMARY due to gap in history: 139066647 - 150854812
2016-08-09 14:35:42 19620 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 27,
members = 1/2 (joined/total),
act_id = 150854812,
last_appl. = 0,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = d01cd98f-2d5e-11e5-9a8a-670acddd40fb
2016-08-09 14:35:42 19620 [Note] WSREP: Flow-control interval: [23, 23]
2016-08-09 14:35:42 19620 [Note] WSREP: New cluster view: global state: d01cd98f-2d5e-11e5-9a8a-670acddd40fb:150854812, view# 28: Primary, number of nodes: 2, my index: 1, protocol version 3
2016-08-09 14:35:42 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-08-09 14:35:42 19620 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-08-09 14:35:42 19620 [Note] WSREP: Member 0.0 (garb) requested state transfer from '*any*'. Selected 1.0 (mysql1.i.srv.lan.switchsys.com)(SYNCED) as donor.
2016-08-09 14:35:42 19620 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 150854812)
2016-08-09 14:35:42 19620 [Note] WSREP: 0.0 (garb): State transfer from 1.0 (mysql1.i.srv.lan.switchsys.com) complete.
2016-08-09 14:35:42 19620 [Note] WSREP: Member 0.0 (garb) synced with group.
2016-08-09 14:35:42 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:35:42 19620 [Note] WSREP: Assign initial position for certification: 150854812, protocol version: 3
2016-08-09 14:35:42 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:35:42 19620 [Note] WSREP: 1.0 (mysql1.i.srv.lan.switchsys.com): State transfer to 0.0 (garb) complete.
2016-08-09 14:35:42 19620 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 150854812)
2016-08-09 14:35:42 19620 [Note] WSREP: Member 1.0 (mysql1.i.srv.lan.switchsys.com) synced with group.
2016-08-09 14:35:42 19620 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 150854812)
2016-08-09 14:35:42 19620 [Note] WSREP: Synchronized with group, ready for connections
2016-08-09 14:35:48 19620 [Note] WSREP: cleaning up fa95df4c (tcp://10.1.0.21:4567)
2016-08-09 14:35:50 19620 [Note] WSREP: (eb9c857f, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2016-08-09 14:35:50 19620 [Note] WSREP: declaring 2f9e7283 at tcp://10.1.0.21:4567 stable
2016-08-09 14:35:50 19620 [Note] WSREP: declaring 3adeb376 at tcp://10.1.0.22:4567 stable
2016-08-09 14:35:50 19620 [Note] WSREP: Node 3adeb376 state prim
2016-08-09 14:35:50 19620 [Note] WSREP: view(view_id(PRIM,2f9e7283,31) memb {
2f9e7283,0
3adeb376,0
eb9c857f,0
} joined {
} left {
} partitioned {
})
2016-08-09 14:35:50 19620 [Note] WSREP: save pc into disk
2016-08-09 14:35:50 19620 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2016-08-09 14:35:50 19620 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2016-08-09 14:35:51 19620 [Note] WSREP: STATE EXCHANGE: sent state msg: 30379689-5e36-11e6-98d8-7e21ceafbdfd
2016-08-09 14:35:51 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 30379689-5e36-11e6-98d8-7e21ceafbdfd from 0 (mysql2.i.srv.lan.switchsys.com)
2016-08-09 14:35:51 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 30379689-5e36-11e6-98d8-7e21ceafbdfd from 1 (garb)
2016-08-09 14:35:51 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 30379689-5e36-11e6-98d8-7e21ceafbdfd from 2 (mysql1.i.srv.lan.switchsys.com)
2016-08-09 14:35:51 19620 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 28,
members = 2/3 (joined/total),
act_id = 150854825,
last_appl. = 0,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = d01cd98f-2d5e-11e5-9a8a-670acddd40fb
2016-08-09 14:35:51 19620 [Note] WSREP: Flow-control interval: [28, 28]
2016-08-09 14:35:51 19620 [Note] WSREP: New cluster view: global state: d01cd98f-2d5e-11e5-9a8a-670acddd40fb:150854825, view# 29: Primary, number of nodes: 3, my index: 2, protocol version 3
2016-08-09 14:35:51 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-08-09 14:35:51 19620 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-08-09 14:35:51 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:35:51 19620 [Note] WSREP: Assign initial position for certification: 150854825, protocol version: 3
2016-08-09 14:35:51 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:35:51 19620 [Note] WSREP: Member 0.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*'. Selected 2.0 (mysql1.i.srv.lan.switchsys.com)(SYNCED) as donor.
2016-08-09 14:35:51 19620 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 150854826)
2016-08-09 14:35:51 19620 [Note] WSREP: IST request: d01cd98f-2d5e-11e5-9a8a-670acddd40fb:150854812-150854825|tcp://10.1.0.21:4568
2016-08-09 14:35:51 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-08-09 14:35:51 19620 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.1.0.21:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' '' --gtid 'd01cd98f-2d5e-11e5-9a8a-670acddd40fb:150854812' --bypass'
2016-08-09 14:35:51 19620 [Note] WSREP: sst_donor_thread signaled with 0
2016-08-09 14:35:51 19620 [Note] WSREP: async IST sender starting to serve tcp://10.1.0.21:4568 sending 150854813-150854825
WSREP_SST: [INFO] Streaming with xbstream (20160809 14:35:51.914)
WSREP_SST: [INFO] Using socat as streamer (20160809 14:35:51.917)
WSREP_SST: [INFO] Bypassing the SST for IST (20160809 14:35:51.922)
WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} ${IST_FILE} | socat -u stdio TCP:10.1.0.21:4444; RC=( ${PIPESTATUS[@]} ) (20160809 14:35:51.928)
2016-08-09 14:35:51 19620 [Note] WSREP: 2.0 (mysql1.i.srv.lan.switchsys.com): State transfer to 0.0 (mysql2.i.srv.lan.switchsys.com) complete.
2016-08-09 14:35:51 19620 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 150854827)
2016-08-09 14:35:51 19620 [Note] WSREP: Member 2.0 (mysql1.i.srv.lan.switchsys.com) synced with group.
2016-08-09 14:35:51 19620 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 150854827)
2016-08-09 14:35:51 19620 [Note] WSREP: Synchronized with group, ready for connections
2016-08-09 14:35:51 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
WSREP_SST: [INFO] Total time on donor: 0 seconds (20160809 14:35:51.933)
WSREP_SST: [INFO] Cleaning up temporary directories (20160809 14:35:51.937)
2016-08-09 14:35:53 19620 [Note] WSREP: (eb9c857f, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-08-09 14:35:53 19620 [Note] WSREP: async IST sender served
2016-08-09 14:35:53 19620 [Note] WSREP: declaring 3adeb376 at tcp://10.1.0.22:4567 stable
2016-08-09 14:35:53 19620 [Note] WSREP: forgetting 2f9e7283 (tcp://10.1.0.21:4567)
2016-08-09 14:35:53 19620 [Note] WSREP: Node 3adeb376 state prim
2016-08-09 14:35:53 19620 [Note] WSREP: view(view_id(PRIM,3adeb376,32) memb {
3adeb376,0
eb9c857f,0
} joined {
} left {
} partitioned {
2f9e7283,0
})
2016-08-09 14:35:53 19620 [Note] WSREP: save pc into disk
2016-08-09 14:35:53 19620 [Note] WSREP: forgetting 2f9e7283 (tcp://10.1.0.21:4567)
2016-08-09 14:35:53 19620 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2016-08-09 14:35:53 19620 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2016-08-09 14:35:53 19620 [Note] WSREP: STATE EXCHANGE: sent state msg: 31d30173-5e36-11e6-b44d-9ea8bf24aaa9
2016-08-09 14:35:53 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 31d30173-5e36-11e6-b44d-9ea8bf24aaa9 from 0 (garb)
2016-08-09 14:35:53 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 31d30173-5e36-11e6-b44d-9ea8bf24aaa9 from 1 (mysql1.i.srv.lan.switchsys.com)
2016-08-09 14:35:53 19620 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 29,
members = 2/2 (joined/total),
act_id = 150854830,
last_appl. = 0,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = d01cd98f-2d5e-11e5-9a8a-670acddd40fb
2016-08-09 14:35:53 19620 [Note] WSREP: Flow-control interval: [23, 23]
2016-08-09 14:35:53 19620 [Note] WSREP: New cluster view: global state: d01cd98f-2d5e-11e5-9a8a-670acddd40fb:150854830, view# 30: Primary, number of nodes: 2, my index: 1, protocol version 3
2016-08-09 14:35:53 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-08-09 14:35:53 19620 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-08-09 14:35:53 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:35:53 19620 [Note] WSREP: Assign initial position for certification: 150854830, protocol version: 3
2016-08-09 14:35:53 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:35:59 19620 [Note] WSREP: cleaning up 2f9e7283 (tcp://10.1.0.21:4567)
2016-08-09 14:38:13 19620 [Note] WSREP: (eb9c857f, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.1.0.21:4567
2016-08-09 14:38:13 19620 [Note] WSREP: declaring 3adeb376 at tcp://10.1.0.22:4567 stable
2016-08-09 14:38:13 19620 [Note] WSREP: declaring 850f82c0 at tcp://10.1.0.21:4567 stable
2016-08-09 14:38:13 19620 [Note] WSREP: Node 3adeb376 state prim
2016-08-09 14:38:13 19620 [Note] WSREP: view(view_id(PRIM,3adeb376,33) memb {
3adeb376,0
850f82c0,0
eb9c857f,0
} joined {
} left {
} partitioned {
})
2016-08-09 14:38:13 19620 [Note] WSREP: save pc into disk
2016-08-09 14:38:13 19620 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2016-08-09 14:38:13 19620 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2016-08-09 14:38:13 19620 [Note] WSREP: STATE EXCHANGE: sent state msg: 85469e78-5e36-11e6-b66d-3f86037fa017
2016-08-09 14:38:13 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 85469e78-5e36-11e6-b66d-3f86037fa017 from 0 (garb)
2016-08-09 14:38:13 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 85469e78-5e36-11e6-b66d-3f86037fa017 from 2 (mysql1.i.srv.lan.switchsys.com)
2016-08-09 14:38:14 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 85469e78-5e36-11e6-b66d-3f86037fa017 from 1 (mysql2.i.srv.lan.switchsys.com)
2016-08-09 14:38:14 19620 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 30,
members = 2/3 (joined/total),
act_id = 150855269,
last_appl. = 150855168,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = d01cd98f-2d5e-11e5-9a8a-670acddd40fb
2016-08-09 14:38:14 19620 [Note] WSREP: Flow-control interval: [28, 28]
2016-08-09 14:38:14 19620 [Note] WSREP: New cluster view: global state: d01cd98f-2d5e-11e5-9a8a-670acddd40fb:150855269, view# 31: Primary, number of nodes: 3, my index: 2, protocol version 3
2016-08-09 14:38:14 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-08-09 14:38:14 19620 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-08-09 14:38:14 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:38:14 19620 [Note] WSREP: Assign initial position for certification: 150855269, protocol version: 3
2016-08-09 14:38:14 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:38:14 19620 [Note] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*'. Selected 2.0 (mysql1.i.srv.lan.switchsys.com)(SYNCED) as donor.
2016-08-09 14:38:14 19620 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 150855272)
2016-08-09 14:38:14 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-08-09 14:38:14 19620 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.1.0.21:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' '' --gtid 'd01cd98f-2d5e-11e5-9a8a-670acddd40fb:150855272''
2016-08-09 14:38:14 19620 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] Streaming with xbstream (20160809 14:38:14.650)
WSREP_SST: [INFO] Using socat as streamer (20160809 14:38:14.652)
WSREP_SST: [INFO] Using /tmp/tmp.GjhDXUHYNy as innobackupex temporary directory (20160809 14:38:14.663)
WSREP_SST: [INFO] Streaming GTID file before SST (20160809 14:38:14.668)
WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:10.1.0.21:4444; RC=( ${PIPESTATUS[@]} ) (20160809 14:38:14.671)
WSREP_SST: [INFO] Sleeping before data transfer for SST (20160809 14:38:14.677)
2016-08-09 14:38:16 19620 [Note] WSREP: (eb9c857f, 'tcp://0.0.0.0:4567') turning message relay requesting off
WSREP_SST: [INFO] Streaming the backup to joiner at 10.1.0.21 4444 (20160809 14:38:24.681)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf --defaults-group=mysqld --no-version-check --no-backup-locks $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:10.1.0.21:4444; RC=( ${PIPESTATUS[@]} ) (20160809 14:38:24.685)
2016-08-09 14:44:34 19620 [Note] WSREP: Provider paused at d01cd98f-2d5e-11e5-9a8a-670acddd40fb:150856622 (11790030)
2016-08-09 14:44:34 19620 [Note] WSREP: resuming provider at 11790030
2016-08-09 14:44:34 19620 [Note] WSREP: Provider resumed.
2016-08-09 14:44:34 19620 [Note] WSREP: 2.0 (mysql1.i.srv.lan.switchsys.com): State transfer to 1.0 (mysql2.i.srv.lan.switchsys.com) complete.
2016-08-09 14:44:34 19620 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 150856622)
2016-08-09 14:44:34 19620 [Note] WSREP: SYNC message ignored as node 2.0 (mysql1.i.srv.lan.switchsys.com) was re-transitioned to DONOR mode before it synced.
2016-08-09 14:44:34 19620 [ERROR] WSREP: sst sent called when not SST donor, state JOINED
WSREP_SST: [INFO] Total time on donor: 0 seconds (20160809 14:44:34.443)
WSREP_SST: [INFO] Cleaning up temporary directories (20160809 14:44:34.449)
2016-08-09 14:44:41 19620 [Note] WSREP: declaring 3adeb376 at tcp://10.1.0.22:4567 stable
2016-08-09 14:44:41 19620 [Note] WSREP: forgetting 850f82c0 (tcp://10.1.0.21:4567)
2016-08-09 14:44:41 19620 [Note] WSREP: Node 3adeb376 state prim
2016-08-09 14:44:41 19620 [Note] WSREP: view(view_id(PRIM,3adeb376,34) memb {
3adeb376,0
eb9c857f,0
} joined {
} left {
} partitioned {
850f82c0,0
})
2016-08-09 14:44:41 19620 [Note] WSREP: save pc into disk
2016-08-09 14:44:41 19620 [Note] WSREP: forgetting 850f82c0 (tcp://10.1.0.21:4567)
2016-08-09 14:44:41 19620 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2016-08-09 14:44:41 19620 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2016-08-09 14:44:41 19620 [Note] WSREP: STATE EXCHANGE: sent state msg: 6c204d25-5e37-11e6-a152-7bc3b4289617
2016-08-09 14:44:41 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 6c204d25-5e37-11e6-a152-7bc3b4289617 from 0 (garb)
2016-08-09 14:44:41 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 6c204d25-5e37-11e6-a152-7bc3b4289617 from 1 (mysql1.i.srv.lan.switchsys.com)
2016-08-09 14:44:41 19620 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 31,
members = 2/2 (joined/total),
act_id = 150856645,
last_appl. = 150856533,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = d01cd98f-2d5e-11e5-9a8a-670acddd40fb
2016-08-09 14:44:41 19620 [Note] WSREP: Flow-control interval: [23, 23]
2016-08-09 14:44:41 19620 [Note] WSREP: Restored state JOINED -> DONOR/DESYNCED (150856645)
2016-08-09 14:44:41 19620 [Note] WSREP: New cluster view: global state: d01cd98f-2d5e-11e5-9a8a-670acddd40fb:150856645, view# 32: Primary, number of nodes: 2, my index: 1, protocol version 3
2016-08-09 14:44:41 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-08-09 14:44:41 19620 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-08-09 14:44:41 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:44:41 19620 [Note] WSREP: Assign initial position for certification: 150856645, protocol version: 3
2016-08-09 14:44:41 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:44:46 19620 [Note] WSREP: cleaning up 850f82c0 (tcp://10.1.0.21:4567)
2016-08-09 14:45:25 19620 [Note] WSREP: (eb9c857f, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.1.0.21:4567
2016-08-09 14:45:26 19620 [Note] WSREP: declaring 3adeb376 at tcp://10.1.0.22:4567 stable
2016-08-09 14:45:26 19620 [Note] WSREP: declaring 86d7990f at tcp://10.1.0.21:4567 stable
2016-08-09 14:45:26 19620 [Note] WSREP: Node 3adeb376 state prim
2016-08-09 14:45:26 19620 [Note] WSREP: view(view_id(PRIM,3adeb376,35) memb {
3adeb376,0
86d7990f,0
eb9c857f,0
} joined {
} left {
} partitioned {
})
2016-08-09 14:45:26 19620 [Note] WSREP: save pc into disk
2016-08-09 14:45:26 19620 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2016-08-09 14:45:26 19620 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2016-08-09 14:45:26 19620 [Note] WSREP: STATE EXCHANGE: sent state msg: 86f35cdf-5e37-11e6-8293-72ab569522d1
2016-08-09 14:45:26 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 86f35cdf-5e37-11e6-8293-72ab569522d1 from 0 (garb)
2016-08-09 14:45:26 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 86f35cdf-5e37-11e6-8293-72ab569522d1 from 2 (mysql1.i.srv.lan.switchsys.com)
2016-08-09 14:45:26 19620 [Note] WSREP: STATE EXCHANGE: got state msg: 86f35cdf-5e37-11e6-8293-72ab569522d1 from 1 (mysql2.i.srv.lan.switchsys.com)
2016-08-09 14:45:26 19620 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 32,
members = 2/3 (joined/total),
act_id = 150856947,
last_appl. = 150856832,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = d01cd98f-2d5e-11e5-9a8a-670acddd40fb
2016-08-09 14:45:26 19620 [Note] WSREP: Flow-control interval: [28, 28]
2016-08-09 14:45:26 19620 [Note] WSREP: New cluster view: global state: d01cd98f-2d5e-11e5-9a8a-670acddd40fb:150856947, view# 33: Primary, number of nodes: 3, my index: 2, protocol version 3
2016-08-09 14:45:26 19620 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-08-09 14:45:26 19620 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-08-09 14:45:26 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:45:26 19620 [Note] WSREP: Assign initial position for certification: 150856947, protocol version: 3
2016-08-09 14:45:26 19620 [Note] WSREP: Service thread queue flushed.
2016-08-09 14:45:26 19620 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2016-08-09 14:45:27 19620 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2016-08-09 14:45:28 19620 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2016-08-09 14:45:29 19620 [Note] WSREP: (eb9c857f, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-08-09 14:45:29 19620 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2016-08-09 14:45:30 19620 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2016-08-09 14:45:31 19620 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2016-08-09 14:45:32 19620 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2016-08-09 14:45:33 19620 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable

The last message is repeated hundreds of times.




Vadim Tkachenko

unread,
Aug 9, 2016, 12:14:54 PM8/9/16
to percona-discussion, explos...@gmail.com
Tom,

Thanks,
I also wonder where you got the version you are using.

The latest on our site
https://www.percona.com/downloads/Percona-XtraDB-Cluster-56/LATEST/
is 5.6.30, not 5.6.31

explos...@gmail.com

Aug 9, 2016, 3:27:49 PM
to Percona Discussion, explos...@gmail.com
Hi Vadim,

We use the official Percona APT repo!
We have never installed anything from any other source.
Tom

Vadim Tkachenko

Aug 9, 2016, 11:31:05 PM
to percona-discussion, Tom Ford

krunal....@percona.com

Aug 9, 2016, 11:37:38 PM
to Percona Discussion, explos...@gmail.com
Hi Tom,

Did you have any active workload that could put the node into a desync state?

Regards,
Krunal


evgeniy...@percona.com

Aug 10, 2016, 1:18:57 AM
to Percona Discussion, explos...@gmail.com
Hi.

> We use the official Percona APT repo!
> We have never installed anything from any other source.

I have checked the repos directly and the latest PXC-56 version is 5.6.30-25.16-1.
Also, installing PXC-56 on jessie installs 5.6.30-25.16-1.jessie.

explos...@gmail.com

Aug 10, 2016, 3:43:43 AM
to Percona Discussion, explos...@gmail.com, evgeniy...@percona.com
Sorry, I'm not sure where I read that version number. These are my installed packages:

ii percona-xtradb-cluster-56 5.6.30-25.16-1.jessie amd64 Percona XtraDB Cluster with Galera
ii percona-xtradb-cluster-client-5.6 5.6.30-25.16-1.jessie amd64 Percona XtraDB Cluster database client binaries
ii percona-xtradb-cluster-common-5.6 5.6.30-25.16-1.jessie amd64 Percona XtraDB Cluster database common files (e.g. /etc/mysql/my.cnf)
ii percona-xtradb-cluster-galera-3 3.16-1.jessie amd64 Metapackage for latest version of galera3.
ii percona-xtradb-cluster-galera-3.x 3.16-1.jessie amd64 Galera components of Percona XtraDB Cluster
ii percona-xtradb-cluster-server-5.6 5.6.30-25.16-1.jessie amd64 Percona XtraDB Cluster database server binaries


Krunal - There are a number of applications that use this cluster and I cannot be sure what might have put it in a desynced state. The desynced node is still working; it is just refusing to act as a donor for our offline node. I don't understand how it can be desynced when only it and the arbitrator are running.

Tom

explos...@gmail.com

Aug 10, 2016, 4:42:18 AM
to Percona Discussion, explos...@gmail.com, evgeniy...@percona.com
Also, just to add that the process list is showing two long-running processes:

mysql> show processlist;
+---------+-------------+------------------+------------+---------+---------+---------------------+------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+---------+-------------+------------------+------------+---------+---------+---------------------+------------------+-----------+---------------+
| 1 | system user | | NULL | Sleep | 2234276 | committed 142118102 | NULL | 0 | 0 |
| 2 | system user | | NULL | Sleep | 3012482 | wsrep aborter idle | NULL | 0 | 0 |

Many normal query processes follow these two.

Process ID 1 looks a bit suspicious, as "committed 142118102" never seems to change.

Tom

krunal....@percona.com

Aug 11, 2016, 3:56:13 AM
to Percona Discussion, explos...@gmail.com, evgeniy...@percona.com
Hi Tom,

From the log we discovered that the use of backup locks is suppressed, causing XtraBackup to fall back to FLUSH TABLES WITH READ LOCK.
We have found an issue where SST and FLUSH TABLES WITH READ LOCK conflict due to the updated FTWRL semantics. We are working on this issue, but in the meantime you can simply re-enable backup locks and things will work as expected (without the error).

------------
WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf  --defaults-group=mysqld --no-version-check --no-backup-locks $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:10.1.0.21:4444; RC=( ${PIPESTATUS[@]} ) (20160809 14:38:24.685) 
------------
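
For what it's worth, one way to see this from the donor itself (a sketch, using counters already present in the status output posted above; Com_lock_tables_for_backup counts backup-lock statements, while FTWRL is counted with the other FLUSH statements):

mysql> SHOW GLOBAL STATUS LIKE 'Com_lock_tables_for_backup';  -- stays at 0 when backup locks are suppressed
mysql> SHOW GLOBAL STATUS LIKE 'Com_flush';                   -- increments when FTWRL is used instead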

The --no-backup-locks limitation was added to PXC a few years back when backup locks had issues, but they are stable now and are the default way of running XtraBackup too.

You will have one of these two settings (FORCE_FTWRL, or inno-backup-opts under the [sst] section) suppressing the use of backup locks. You can comment it out for now; a sketch of the change follows the quoted snippet below.

<snippet of the limitation, which is no longer valid with newer versions of PXC 5.6>
Backup locks used during SST or with Xtrabackup can crash on the donor; either use inno-backup-opts='--no-backup-locks' under [sst] in my.cnf or set FORCE_FTWRL=1 in /etc/sysconfig/mysql (or /etc/sysconfig/mysql.%i for the corresponding unit/service) on CentOS/RHEL, or in /etc/default/mysql on Debian/Ubuntu. You can also use rsync as the alternate SST method if those don't work.
</snippet>
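
Concretely, the change is just to comment out whichever of the two your setup uses (a sketch; adjust file locations to your install):

# in my.cnf on the donor node, comment out the [sst] override if present:
[sst]
# inno-backup-opts="--no-backup-locks"

# and/or in /etc/default/mysql on Debian/Ubuntu, comment out:
# FORCE_FTWRL=1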

Regards,
Krunal

explos...@gmail.com

Aug 11, 2016, 4:21:44 AM
to Percona Discussion, explos...@gmail.com, evgeniy...@percona.com
Hi Krunal,

Thank you for this information. I have commented out the --no-backup-locks option in the my.cnf of the broken node, but it still gets the error in the logs when I try to start MySQL:

2016-08-11 09:19:34 28473 [Warning] WSREP: Member 1.0 (mysql2.i.srv.lan.switchsys.com) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable

This is because the working node is still not in a synced state. How can I force the working node back to synced so that it can act as a donor again?

Tom

Vadim Tkachenko

Aug 11, 2016, 8:50:50 PM
to percona-discussion, Tom Ford, Evgeniy Patlan
Tom,

I am afraid you need to get the donor into SYNCED status before any further actions.
Most likely you will need to bootstrap the cluster to resolve this.
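
Roughly, on PXC 5.6 that means something like this (a sketch, assuming the SysV init script still manages MySQL on your jessie boxes; double-check which node has the most advanced state before bootstrapping from it):

# on the node you keep as the seed (the current donor):
/etc/init.d/mysql stop
/etc/init.d/mysql bootstrap-pxc

# then start the other node normally so it rejoins and takes SST/IST from the seed:
/etc/init.d/mysql start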

matthe...@gmail.com

Feb 23, 2017, 4:04:39 AM
to Percona Discussion, explos...@gmail.com, evgeniy...@percona.com
Hi Krunal,

Sorry to reactivate a slightly old thread, but I am slightly confused. The current Percona limitations page for XtraDB Cluster 5.6 and 5.7 contains the snippet you identified, which says you should not use backup locks. You seem to be saying the opposite: that you should use them rather than FTWRL. Is this simply a case of the 5.6 documentation being copied to 5.7 and being incorrect?

Thanks

Matthew