Problems encountered while setting up a glusterfs+ctdb+samba HA cluster #1

qiulin · 2014-02-21T05:35:35Z

```
[root@gfs2 samba]# ctdb scriptstatus
12 scripts were executed last monitor cycle
00.ctdb Status:OK Duration:0.011 Fri Feb 21 13:15:34 2014
01.reclock Status:OK Duration:0.020 Fri Feb 21 13:15:34 2014
10.interface Status:OK Duration:0.028 Fri Feb 21 13:15:34 2014
11.natgw Status:OK Duration:0.009 Fri Feb 21 13:15:34 2014
11.routing Status:OK Duration:0.009 Fri Feb 21 13:15:34 2014
13.per_ip_routing Status:OK Duration:0.009 Fri Feb 21 13:15:34 2014
20.multipathd Status:OK Duration:0.009 Fri Feb 21 13:15:34 2014
31.clamd Status:OK Duration:0.009 Fri Feb 21 13:15:34 2014
40.vsftpd Status:OK Duration:0.010 Fri Feb 21 13:15:34 2014
41.httpd Status:OK Duration:0.010 Fri Feb 21 13:15:34 2014
50.samba Status:OK Duration:0.107 Fri Feb 21 13:15:34 2014
60.nfs Status:ERROR Duration:0.039 Fri Feb 21 13:15:34 2014
OUTPUT:rpcinfo: RPC: Program not registered
ERROR: NFS not responding to rpc requests
```
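
To make the failing event script easier to spot, the `ctdb scriptstatus` text can be filtered for entries whose status is `ERROR`. The following is only a sketch: the function name `failing_scripts` and the `SAMPLE` string are illustrative and not part of CTDB, and the sample is abridged from the output above with the two messages on separate lines.

```python
import re

# Abridged sample of `ctdb scriptstatus` output (illustrative; in practice the
# text would be captured from the command on a cluster node).
SAMPLE = """\
12 scripts were executed last monitor cycle
00.ctdb Status:OK Duration:0.011 Fri Feb 21 13:15:34 2014
50.samba Status:OK Duration:0.107 Fri Feb 21 13:15:34 2014
60.nfs Status:ERROR Duration:0.039 Fri Feb 21 13:15:34 2014
OUTPUT:rpcinfo: RPC: Program not registered
ERROR: NFS not responding to rpc requests
"""

# Matches lines such as "60.nfs Status:ERROR Duration:0.039 ...".
SCRIPT_RE = re.compile(r"^(\S+)\s+Status:(\S+)\s+Duration:([\d.]+)")


def failing_scripts(text):
    """Return {script name: [output lines]} for scripts reporting Status:ERROR."""
    failures = {}
    current = None  # name of the failing script whose output is being collected
    for line in text.splitlines():
        match = SCRIPT_RE.match(line)
        if match:
            name, status = match.group(1), match.group(2)
            current = name if status == "ERROR" else None
            if current is not None:
                failures[current] = []
        elif current is not None and line.strip():
            # Lines after a failing script belong to its captured output.
            if line.startswith("OUTPUT:"):
                line = line[len("OUTPUT:"):]
            failures[current].append(line)
    return failures


if __name__ == "__main__":
    for script, output in failing_scripts(SAMPLE).items():
        print(script, output)
```

On the sample, only `60.nfs` is reported, together with the two messages it printed.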

[root@gfs2 samba]# ctdb status
Number of nodes:3
pnn:0 192.168.121.213 OK
pnn:1 192.168.121.214 UNHEALTHY
pnn:2 192.168.121.215 OK (THIS NODE)
Generation:320197649
Size:3
hash:0 lmaster:0
hash:1 lmaster:1
hash:2 lmaster:2
Recovery mode:NORMAL (0)
Recovery master:1
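
The same kind of filtering can be applied to `ctdb status` to list nodes that are not `OK`. This is a rough sketch under assumptions: `nodes_not_ok` and `SAMPLE` are hypothetical names, and the parser only understands the `pnn:` line layout shown above.

```python
import re

# Abridged sample of `ctdb status` output (illustrative only).
SAMPLE = """\
Number of nodes:3
pnn:0 192.168.121.213 OK
pnn:1 192.168.121.214 UNHEALTHY
pnn:2 192.168.121.215 OK (THIS NODE)
Recovery mode:NORMAL (0)
Recovery master:1
"""

# Matches "pnn:<number> <address> <state>"; a trailing "(THIS NODE)" is ignored.
NODE_RE = re.compile(r"^pnn:(\d+)\s+(\S+)\s+(\S+)")


def nodes_not_ok(text):
    """Return (pnn, address, state) tuples for every node whose state is not OK."""
    result = []
    for line in text.splitlines():
        match = NODE_RE.match(line)
        if match and match.group(3) != "OK":
            result.append((int(match.group(1)), match.group(2), match.group(3)))
    return result


if __name__ == "__main__":
    print(nodes_not_ok(SAMPLE))
```

For the output above, this points at pnn 1 (192.168.121.214), which is also the node listed as recovery master.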

qiulin · 2014-02-21T07:07:54Z

2014/02/21 14:58:31.827489 [32106]: Release freeze handler for prio 1
2014/02/21 14:58:31.827522 [32106]: Thawing priority 2
2014/02/21 14:58:31.827541 [32106]: Release freeze handler for prio 2
2014/02/21 14:58:31.827576 [32106]: Thawing priority 3
2014/02/21 14:58:31.827597 [32106]: Release freeze handler for prio 3
2014/02/21 14:58:43.401240 [32158]: Trigger takeoverrun
2014/02/21 14:58:52.048137 [32106]: rpcinfo: RPC: Program not registered
2014/02/21 14:58:52.048425 [32106]: ERROR: NFS not responding to rpc requests
2014/02/21 14:58:52.048824 [32106]: Node became UNHEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:58:53.425502 [32158]: Trigger takeoverrun
2014/02/21 14:58:57.488836 [32106]: Node became HEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:58:59.440842 [32158]: Trigger takeoverrun
2014/02/21 14:59:13.330952 [32106]: rpcinfo: RPC: Program not registered
2014/02/21 14:59:13.331285 [32106]: ERROR: NFS not responding to rpc requests
2014/02/21 14:59:13.331628 [32106]: Node became UNHEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:59:14.051789 [32106]: Killing TCP connection 192.168.131.164:36703 192.168.121.235:445
2014/02/21 14:59:14.094761 [32106]: killed 1 TCP connections to released IP 192.168.121.235
2014/02/21 14:59:14.477872 [32158]: Trigger takeoverrun
2014/02/21 14:59:18.769927 [32106]: Node became HEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:59:20.493635 [32158]: Trigger takeoverrun
2014/02/21 14:59:34.602809 [32106]: rpcinfo: RPC: Program not registered
2014/02/21 14:59:34.603139 [32106]: ERROR: NFS not responding to rpc requests
2014/02/21 14:59:34.603483 [32106]: Node became UNHEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:59:35.321162 [32106]: Killing TCP connection 192.168.131.164:36704 192.168.121.235:445
2014/02/21 14:59:35.365391 [32106]: killed 1 TCP connections to released IP 192.168.121.235
2014/02/21 14:59:36.533835 [32158]: Trigger takeoverrun
2014/02/21 14:59:40.044422 [32106]: Node became HEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:59:41.546972 [32158]: Trigger takeoverrun
2014/02/21 14:59:55.874704 [32106]: rpcinfo: RPC: Program not registered
2014/02/21 14:59:55.875025 [32106]: ERROR: NFS not responding to rpc requests
2014/02/21 14:59:55.875376 [32106]: Node became UNHEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:59:56.583229 [32158]: We are still serving a public address '192.168.121.235' that we should not be serving.
2014/02/21 14:59:56.583311 [32158]: We are still serving a public address '192.168.121.234' that we should not be serving.
2014/02/21 14:59:56.583346 [32158]: Trigger takeoverrun
2014/02/21 14:59:56.595685 [32106]: Killing TCP connection 192.168.131.164:36706 192.168.121.235:445
2014/02/21 14:59:56.638483 [32106]: killed 1 TCP connections to released IP 192.168.121.235
2014/02/21 14:59:57.585901 [32158]: Trigger takeoverrun
2014/02/21 15:00:01.314901 [32106]: Node became HEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 15:00:02.609237 [32158]: Trigger takeoverrun
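
The log above shows the node flipping between UNHEALTHY and HEALTHY. To quantify that, the transitions can be extracted and the spacing between successive UNHEALTHY events computed. The sketch below assumes the timestamp layout seen in this log; `health_transitions`, `unhealthy_intervals` and `SAMPLE` are hypothetical names introduced for illustration.

```python
import re
from datetime import datetime

# A few transition lines copied from the log above (illustrative sample).
SAMPLE = """\
2014/02/21 14:58:52.048824 [32106]: Node became UNHEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:58:57.488836 [32106]: Node became HEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:59:13.331628 [32106]: Node became UNHEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:59:18.769927 [32106]: Node became HEALTHY. Ask recovery master 1 to perform ip reallocation
2014/02/21 14:59:34.603483 [32106]: Node became UNHEALTHY. Ask recovery master 1 to perform ip reallocation
"""

# Captures the timestamp and the new state from "Node became ..." lines.
EVENT_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) \[\d+\]: "
    r"Node became (HEALTHY|UNHEALTHY)\."
)


def health_transitions(text):
    """Return a list of (timestamp, state) for every health transition line."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S.%f")
            events.append((stamp, match.group(2)))
    return events


def unhealthy_intervals(events):
    """Return seconds between consecutive UNHEALTHY transitions."""
    times = [stamp for stamp, state in events if state == "UNHEALTHY"]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    print(unhealthy_intervals(health_transitions(SAMPLE)))
```

On these lines the node turns UNHEALTHY roughly every 21.3 seconds, which would be consistent with a periodic monitor check failing repeatedly in `60.nfs`, though the log alone does not confirm the cause.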

qiulin · 2014-02-24T04:58:02Z

OUTPUT:rpcinfo: RPC: Program not registered
ERROR: NFS not responding to rpc requests

qiulin · 2014-02-26T08:49:23Z

50.samba Status:ERROR Duration:0.088 Wed Feb 26 16:48:01 2014
OUTPUT:ERROR: winbind - wbinfo -p returned error

qiulin self-assigned this Feb 21, 2014

qiulin added the question label Feb 21, 2014
