web-dev-qa-db-ja.com

Galera bootstrapエラーなしで失敗する

CentOS7でMariaDB10.1.5(Galera 25.3.9を含む)を使用して、Galeraクラスターを起動して実行しようと、週末中ずっと髪を引っ張っています。このサーバーは2つのデータベースインスタンスを実行します。クラスターの一部。

私はこの正確なセットアップを何度も問題なく実行しましたが、今は無知です...

my.cnf:

[mysqld_multi]
mysqld=/usr/bin/mysqld_safe --basedir=/usr
user=mysql

#
# Server local DB
#
[mysqld1]
mysqld=/usr/bin/mysqld_safe --basedir=/usr
user=mysql
port=3307
pid-file=/storage/local/mysqld.pid
socket=/storage/local/mysqld.sock
datadir=/storage/local
log-error=/var/log/mariadb-local.log

#
# Server cluster DB
#
[mysqld2]
wsrep_new_cluster
wsrep_cluster_address=gcomm://
#wsrep_cluster_address="gcomm://172.16.1.91,172.16.1.94,172.16.1.95"
wsrep_debug=1
query_cache_size=0
log-bin=/var/log/mariadb/cluster-bin.log
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name="MyApps"
wsrep_node_address=172.16.1.91
wsrep_node_name="dbnode1"
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:sstpass
mysqld=/usr/bin/mysqld_safe --basedir=/usr
user=mysql
port=3306
pid-file=/storage/cluster/mysqld.pid
socket=/storage/cluster/mysqld.sock
datadir=/storage/cluster
log-error=/var/log/mariadb-cluster.log

このノードを起動すると、ログに次のように表示されます。

2015-08-02 20:52:52 mysqld_safe Starting mysqld daemon with databases from /storage/cluster
2015-08-02 20:52:52 140344148072576 [Note] /usr/sbin/mysqld (mysqld 10.1.5-MariaDB-log) starting as process 11301 ...
2015-08-02 20:52:52 140344148072576 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2015-08-02 20:52:52 140344148072576 [Note] InnoDB: The InnoDB memory heap is disabled
2015-08-02 20:52:52 140344148072576 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-08-02 20:52:52 140344148072576 [Note] InnoDB: Memory barrier is not used
2015-08-02 20:52:52 140344148072576 [Note] InnoDB: Compressed tables use zlib 1.2.7
2015-08-02 20:52:52 140344148072576 [Note] InnoDB: Using Linux native AIO
2015-08-02 20:52:52 140344148072576 [Note] InnoDB: Not using CPU crc32 instructions
2015-08-02 20:52:53 140344148072576 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2015-08-02 20:52:53 140344148072576 [Note] InnoDB: Completed initialization of buffer pool
2015-08-02 20:52:53 140344148072576 [Note] InnoDB: Highest supported file format is Barracuda.
2015-08-02 20:52:53 140344148072576 [Note] Plugin 'FEEDBACK' is disabled.
2015-08-02 20:52:53 140233078593664 [Note] Recovering after a crash using tc.log
2015-08-02 20:52:53 140233078593664 [Note] Starting crash recovery...
2015-08-02 20:52:53 140233078593664 [Note] Crash recovery finished.
2015-08-02 20:52:53 140232389265152 [Note] InnoDB: not started
2015-08-02 20:52:53 140344148072576 [Note] Server socket created on IP: '::'.
2015-08-02 20:52:53 140344148072576 [Note] Event Scheduler: Loaded 0 events
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Setting wsrep_ready to 0
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Read WSREPXid from InnoDB:  00000000-0000-0000-0000-000000000000:-1
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Read nil XID from storage engines, skipping position init
2015-08-02 20:52:53 140344148072576 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
2015-08-02 20:52:53 140344148072576 [Note] WSREP: wsrep_load(): Galera 25.3.9(r3387) by Codership Oy <[email protected]> loaded successfully.
2015-08-02 20:52:53 140344148072576 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Passing config to GCS: base_Host = 172.16.1.91; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /storage/cluster/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /storage/cluster//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.reco
2015-08-02 20:52:53 140343261148928 [Note] WSREP: Service thread queue flushed.
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Start replication
2015-08-02 20:52:53 140344148072576 [Note] WSREP: 'wsrep-new-cluster' option used, bootstrapping the cluster
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2015-08-02 20:52:53 140344148072576 [Note] WSREP: protonet asio version 0
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Using CRC-32C for message checksums.
2015-08-02 20:52:53 140344148072576 [Note] WSREP: backend: asio
2015-08-02 20:52:53 140344148072576 [Warning] WSREP: access file(gvwstate.dat) failed(No such file or directory)
2015-08-02 20:52:53 140344148072576 [Note] WSREP: restore pc from disk failed
2015-08-02 20:52:53 140344148072576 [Note] WSREP: GMCast version 0
2015-08-02 20:52:53 140344148072576 [Note] WSREP: (71c20f80, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2015-08-02 20:52:53 140344148072576 [Note] WSREP: (71c20f80, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2015-08-02 20:52:53 140344148072576 [Note] WSREP: EVS version 0
2015-08-02 20:52:53 140344148072576 [Note] WSREP: gcomm: bootstrapping new group 'MyApps'
2015-08-02 20:52:53 140344148072576 [Note] WSREP: start_prim is enabled, turn off pc_recovery
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Node 71c20f80 state prim
2015-08-02 20:52:53 140344148072576 [Note] WSREP: view(view_id(PRIM,71c20f80,1) memb {
    71c20f80,0
} joined {
} left {
} partitioned {
})
2015-08-02 20:52:53 140344148072576 [Note] WSREP: save pc into disk
2015-08-02 20:52:53 140344148072576 [Note] WSREP: gcomm: connected
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2015-08-02 20:52:53 140344148072576 [Note] WSREP: Opened channel 'MyApps'
2015-08-02 20:52:53 140343215384320 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
2015-08-02 20:52:53 140343215384320 [Note] WSREP: Starting new group from scratch: 71c26ca7-3958-11e5-aa0e-92c6feb108bf
2015-08-02 20:52:53 140343215384320 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 71c2791f-3958-11e5-9fcc-52d3a70422f3
2015-08-02 20:52:53 140343215384320 [Note] WSREP: STATE EXCHANGE: sent state msg: 71c2791f-3958-11e5-9fcc-52d3a70422f3
2015-08-02 20:52:53 140343215384320 [Note] WSREP: STATE EXCHANGE: got state msg: 71c2791f-3958-11e5-9fcc-52d3a70422f3 from 0 (dbnode1)
2015-08-02 20:52:53 140343215384320 [Note] WSREP: Quorum results:
    version    = 3,
    component  = PRIMARY,
    conf_id    = 0,
    members    = 1/1 (joined/total),
    act_id     = 0,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 71c26ca7-3958-11e5-aa0e-92c6feb108bf
2015-08-02 20:52:53 140343215384320 [Note] WSREP: Flow-control interval: [16, 16]
2015-08-02 20:52:53 140343215384320 [Note] WSREP: Restored state OPEN -> JOINED (0)
2015-08-02 20:52:53 140343215384320 [Note] WSREP: Member 0.0 (dbnode1) synced with group.
2015-08-02 20:52:53 140343215384320 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2015-08-02 20:52:53 140344147060480 [Note] WSREP: New cluster view: global state: 71c26ca7-3958-11e5-aa0e-92c6feb108bf:0, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 3
2015-08-02 20:52:53 140344148072576 [Note] /usr/sbin/mysqld: ready for connections.

これは、すべてが正常であることを示していますが、wsrep変数をチェックすると、WSREP_READY = OFFであり、エンジンにアクセスしようとすると、

ERROR 1047 (08S01): WSREP has not yet prepared node for application use

ログは正常に見え、エラーはありません(シャットダウンに時間がかかりすぎるため、おそらく何かがハングしているために発生するクラッシュリカバリを除く)、私が心配しているのは

WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1

奇妙だと思うのは、クラスターをブートストラップしていることです。これはbootstrap = yesではないでしょうか?

1
WanWizard

Mariadb 10.1に付属のsystemdユニットファイルは、ブートストラップガレラをまだサポートしていません。

回避策として、次のものを使用できます。

[root@galera01 mysql]# systemctl set-environment MYSQLD_OPTS="--wsrep-new-cluster"
[root@galera01 mysql]# systemctl start mariadb
[root@galera01 mysql]# systemctl unset-environment MYSQLD_OPTS

これは、問題を追跡するmariadbのバグです: https://mariadb.atlassian.net/browse/MDEV-7752

1
Sven Lankes

これはあなたが期待している解決策ではないかもしれませんが、 Percona XtraDB Cluster の使用を検討しましたか?これは、Galleraライブラリにバンドルおよび構成されたPerconaサーバー(MySQLサーバーの完全に互換性のある代替品)です。

MariaDBに厳密な場合は、いつでもデフォルトのPerconaServer構成ファイルを参照として使用できます。

0
sam_pan_mariusz