web-dev-qa-db-ja.com

GlusterFSが起動時にマウントに失敗する

クライアントとサーバーの両方として機能しているUbuntu 12.04ボックスで公式のGlusterFS 3.5パッケージを実行しています。起動時にGlusterFSボリュームをマウントすることを除いて、すべてが正常に動作しているようです。これは私がログファイルで見るものです:

[2014-06-17 08:20:52.969258] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.0 (/usr/sbin/glusterfs --volfile-server=127.0.0.1 --volfile-id=/public_uploads /var/www/shared/public/uploads)
[2014-06-17 08:20:52.998985] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-06-17 08:20:52.999048] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2014-06-17 08:20:53.000373] E [socket.c:2161:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection refused)
[2014-06-17 08:20:53.000427] E [glusterfsd-mgmt.c:1601:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-Host: 127.0.0.1 (No data available)
[2014-06-17 08:20:53.000442] I [glusterfsd-mgmt.c:1607:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2014-06-17 08:20:53.013793] W [glusterfsd.c:1095:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x27) [0x7f686e0160f7] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x1a4) [0x7f686e019cc4] (-->/usr/sbin/glusterfs(+0xcada) [0x7f686e6ddada]))) 0-: received signum (1), shutting down
[2014-06-17 08:20:53.013830] I [Fuse-bridge.c:5444:fini] 0-Fuse: Unmounting '/var/www/shared/public/uploads'.

私のfstabに含まれるもの:

proc        /proc                        proc    defaults                       0       0
/dev/xvda   /                            ext4    noatime,errors=remount-ro      0       1
/dev/xvdb   none                         swap    sw                             0       0
/dev/xvdc   /var/lib/glusterfs/brick01   ext4    defaults                       1       2
127.0.0.1:/private_uploads /var/www/shared/private/uploads glusterfs defaults,_netdev 0 0

これは以前、UbuntuのGlusterFS 3.2のバグであったことを知っていますが、次に示すように、GlusterFS 3.4のPPAパッケージで解決されたことを理解しています: https://bugs.launchpad.net/ubuntu/+source/ glusterfs/+ bug/876648

これは、いくつかの仮想マシンで実行した実験でも機能したことを覚えています(ただし、機能しただけなので、あまり深く調べていませんでした)。 gluster-clientパッケージが、mounting-glusterfs.confと呼ばれるupstartジョブを提供していることがわかります。

author "Louis Zuckerman <[email protected]>"
description "Block the mounting event for glusterfs filesystems until the network interfaces are running"

instance $MOUNTPOINT

start on mounting TYPE=glusterfs
task
exec start wait-for-state WAIT_FOR=static-network-up WAITER=mounting-glusterfs-$MOUNTPOINT

しかし、私はそれがどのように機能するべきかについてはよくわかりません。それは箱から出して動作していないようです。 glusterfsボリュームのマウントはネットワークの開始後に行われますが、GlusterFSの開始前に行われます。

 * Starting RPC portmapper replacement                                   [ OK ]
 * Stopping rpcsec_gss daemon                                            [ OK ]
 * Starting Start this job to wait until rpcbind is started or fails to s[ OK ]
 * Starting configure network device                                     [ OK ]
 * Stopping Start this job to wait until rpcbind is started or fails to s[ OK ]
 * Starting Bridge socket events into upstart                            [ OK ]
 * Starting NSM status monitor                                           [ OK ]
 * Stopping cold plug devices                                            [ OK ]
 * Stopping log initial device creation                                  [ OK ]
 * Starting load fallback graphics devices                               [ OK ]
 * Starting configure network device security                            [ OK ]
 * Starting load fallback graphics devices                               [fail]
 * Starting configure virtual network devices                            [ OK ]
 * Starting Send an event to indicate plymouth is up                     [ OK ]
 * Stopping Send an event to indicate plymouth is up                     [ OK ]
 * Starting Mount network filesystems                                    [ OK ]
 * Stopping configure virtual network devices                            [ OK ]
 * Stopping Mount network filesystems                                    [ OK ]
 * Starting Mount network filesystems                                    [ OK ]
 * Stopping Mount network filesystems                                    [ OK ]
 * Starting configure network device                                     [ OK ]
 * Starting set sysctls from /etc/sysctl.conf                            [ OK ]
 * Stopping set sysctls from /etc/sysctl.conf                            [ OK ]
The disk drive for /var/www/shared/public/uploads is not ready yet or not present.
Continue to wait, or Press S to skip mounting or M for manual recovery
 * Starting Waiting for state                                            [fail]
 * Starting Block the mounting event for glusterfs filesystems until the [fail]k interfaces are running
mountall: Event failed


Mount failed. Please check the log file for more details.
 * Starting GNU Screen Cleanup                                           [ OK ]
 * Starting flush early job output to logs                               [ OK ]
 * Starting base                                                         [ OK ]
 * Starting save udev log and update rules                               [ OK ]
 * Starting OpenSSH server                                               [ OK ]
 * Stopping Failsafe Boot Delay                                          [ OK ]
 * Starting System V initialisation compatibility                        [ OK ]
 * Stopping save udev log and update rules                               [ OK ]
 * Stopping Mount filesystems on boot                                    [ OK ]
 * Stopping GNU Screen Cleanup                                           [ OK ]
 * Stopping flush early job output to logs                               [ OK ]
 * Starting system logging daemon                                        [ OK ]
 * Stopping System V initialisation compatibility                        [ OK ]
 * Starting System V runlevel compatibility                              [ OK ]
 * Starting save kernel messages                                         [ OK ]
 * Starting deferred execution scheduler                                 [ OK ]
 * Starting CPU interrupts balancing daemon                              [ OK ]
 * Starting regular background program processing daemon                 [ OK ]
 * Starting automatic crash report generation                            [ OK ]
 * Starting GlusterFS Management Daemon                                  [ OK ]

何が起こっているのか、そして/またはそれをどのように修正するのか?

私があまり興奮していない代替手段として、私はそれらのボリュームをマウントする新興ジョブを試してみました。 noautoをfstab glusterfsエントリに追加して、ブートアイテムで自動的にマウントされないようにし、次の内容でupstartジョブを作成しました。

description "Mount public uploads"

start on started glusterfs-server

exec mount /var/www/shared/public/uploads

サーバーを再起動したときに、ボリュームがマウントされていませんでした。 /var/log/upstart/mount_public_uploads.logには以下が含まれます:

Mount failed. Please check the log file for more details.

および/var/log/glusterf/var-www-shared-public-uploads.logコンテナ:

2014-06-19 15:01:47.170299] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.0 (/usr/sbin/glusterfs --volfile-server=127.0.0.1 --volfile-id=/public_uploads /var/www/shared/public/uploads)
[2014-06-19 15:01:47.190852] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-06-19 15:01:47.190933] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2014-06-19 15:01:50.613939] I [dht-shared.c:311:dht_init_regex] 0-public_uploads-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2014-06-19 15:01:50.616107] I [socket.c:3561:socket_init] 0-public_uploads-client-0: SSL support is NOT enabled
[2014-06-19 15:01:50.616128] I [socket.c:3576:socket_init] 0-public_uploads-client-0: using system polling thread
[2014-06-19 15:01:50.616158] I [client.c:2273:notify] 0-public_uploads-client-0: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
  1: volume public_uploads-client-0
  2:     type protocol/client
  3:     option remote-Host koraga.int.example.com
  4:     option remote-subvolume /var/lib/glusterfs/brick01/public_uploads
  5:     option transport-type socket
  6:     option username 51275c7d-33b4-46cc-b8e9-9c06b5dfcda5
  7:     option password 36401ce2-18e7-427e-b126-30d2d9351480
  8:     option transport.socket.ssl-enabled off
  9: end-volume
 10:
 11: volume public_uploads-dht
 12:     type cluster/distribute
 13:     subvolumes public_uploads-client-0
 14: end-volume
 15:
 16: volume public_uploads-write-behind
 17:     type performance/write-behind
 18:     subvolumes public_uploads-dht
 19: end-volume
 20:
 21: volume public_uploads-read-ahead
 22:     type performance/read-ahead
 23:     subvolumes public_uploads-write-behind
 24: end-volume
 25:
 26: volume public_uploads-io-cache
 27:     type performance/io-cache
 28:     subvolumes public_uploads-read-ahead
 29: end-volume
 30:
 31: volume public_uploads-quick-read
 32:     type performance/quick-read
 33:     subvolumes public_uploads-io-cache
 34: end-volume
 35:
 36: volume public_uploads-open-behind
 37:     type performance/open-behind
 38:     subvolumes public_uploads-quick-read
 39: end-volume
 40:
 41: volume public_uploads-md-cache
 42:     type performance/md-cache
 43:     subvolumes public_uploads-open-behind
 44: end-volume
 45:
 46: volume public_uploads
 47:     type debug/io-stats
 48:     option latency-measurement off
 49:     option count-fop-hits off
 50:     subvolumes public_uploads-md-cache
 51: end-volume
 52:
+------------------------------------------------------------------------------+
[2014-06-19 15:01:50.619723] E [client-handshake.c:1742:client_query_portmap_cbk] 0-public_uploads-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2014-06-19 15:01:50.619795] I [client.c:2208:client_rpc_notify] 0-public_uploads-client-0: disconnected from 192.168.134.227:24007. Client process will keep trying to connect to glusterd until brick's port is available
[2014-06-19 15:01:50.629922] I [Fuse-bridge.c:4946:Fuse_graph_setup] 0-Fuse: switched to graph 0
[2014-06-19 15:01:50.630166] I [Fuse-bridge.c:3883:Fuse_init] 0-glusterfs-Fuse: Fuse inited with protocol versions: glusterfs 7.22 kernel 7.22
[2014-06-19 15:01:50.630473] W [Fuse-bridge.c:739:Fuse_attr_cbk] 0-glusterfs-Fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)
[2014-06-19 15:01:50.642752] I [Fuse-bridge.c:4787:Fuse_thread_proc] 0-Fuse: unmounting /var/www/shared/public/uploads
[2014-06-19 15:01:50.643121] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f6d5111c3fd] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f6d513efe9a] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x7f6d51ee91b5]))) 0-: received signum (15), shutting down
[2014-06-19 15:01:50.643144] I [Fuse-bridge.c:5444:fini] 0-Fuse: Unmounting '/var/www/shared/public/uploads'.

これが重要な行だと私は思います:

[2014-06-19 15:01:50.619723] E [client-handshake.c:1742:client_query_portmap_cbk] 0-public_uploads-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.

サービスmount_public_uploads startを手動で実行すると、正常にマウントされます。多分それはglusterfsが準備ができる前にマウントしようとしていますか?

8
pupeno

これは既知の問題のようです README.Ubunt によると、Ubuntu 14.04で修正する必要があります。

以前のUbuntuバージョンの考えられる回避策は、GlusterFSサーバーの起動後にカスタムの起動ジョブを使用してボリュームのマウントを延期することです。

4
grekasius

おそらく、PPAをセットアップする前に、Ubuntuのリポジトリからglusterfs-clientをインストールしましたか?そのため、最新バージョンはありません。

0
Michael Gugino