drbd1 is the primary node and drbd2 is the secondary.
1. Disconnect the primary
Shut the machine down or unplug its network cable.
2. Check the state of the secondary
[root@drbd2 ~]# drbdadm role fs
Secondary/Unknown
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
# Note the cs (connection state) field on drbd2
1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
ns:567256 nr:20435468 dw:21002724 dr:169 al:229 bm:1248 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
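When this check is scripted rather than read by eye, the `cs`, `ro` and `ds` fields can be pulled out of the status line. The following is only a minimal sketch: `drbd_field` is a hypothetical helper name, and it assumes the DRBD 8.3 `/proc/drbd` layout shown above.

```shell
# drbd_field FIELD
# Reads /proc/drbd-style text on stdin and prints the value of FIELD
# (cs, ro or ds) from the first device status line that contains it.
# Hypothetical helper; assumes the "cs:... ro:... ds:..." layout of DRBD 8.3.
drbd_field() {
    sed -n "s/.*[[:space:]]$1:\([^[:space:]]*\).*/\1/p" | head -n 1
}

# On a live node you would run: drbd_field cs < /proc/drbd
sample=' 1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----'
printf '%s\n' "$sample" | drbd_field cs   # prints WFConnection
printf '%s\n' "$sample" | drbd_field ro   # prints Secondary/Unknown
```

A value of `WFConnection` means this node is waiting for its peer, which is what you expect on drbd2 once the primary has been disconnected.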
Promote the secondary to the primary role
[root@drbd2 ~]# drbdadm primary fs
[root@drbd2 ~]# drbdadm role fs
Primary/Unknown
[root@drbd2 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
ns:567256 nr:20435468 dw:21002724 dr:169 al:229 bm:1248 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
# Mount the device
[root@drbd2 ~]# mount /dev/drbd1 /mnt/
[root@drbd2 ~]# cd /mnt/
[root@drbd2 mnt]# ll
total 102524
-rw-r--r-- 1 root root 104857600 Jul 8 12:35 100M
drwx------ 2 root root 16384 Jul 8 12:33 lost+found
# The original primary comes back up, and a split brain occurs.
[root@drbd1 ~]# tail -f /var/log/messages
Jul 8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-1 exit code 0 (0x0)
Jul 8 13:14:01 localhost kernel: block drbd1: Split-Brain detected but unresolved, dropping connection!
Jul 8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm split-brain minor-1
Jul 8 13:14:01 localhost kernel: block drbd1: helper command: /sbin/drbdadm split-brain minor-1 exit code 0 (0x0)
Jul 8 13:14:01 localhost kernel: block drbd1: conn( NetworkFailure -> Disconnecting )
Jul 8 13:14:01 localhost kernel: block drbd1: error receiving ReportState, l: 4!
Jul 8 13:14:01 localhost kernel: block drbd1: Connection closed
Jul 8 13:14:01 localhost kernel: block drbd1: conn( Disconnecting -> StandAlone )
Jul 8 13:14:01 localhost kernel: block drbd1: receiver terminated
Jul 8 13:14:01 localhost kernel: block drbd1: Terminating receiver thread
[root@drbd1 ~]# drbdadm role fs
Primary/Unknown
[root@drbd2 mnt]# drbdadm role fs
Primary/Unknown
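The log message above is what a monitoring script would look for. A minimal sketch, assuming the message text is exactly as logged by this 8.3.11 kernel module; `has_split_brain` is a hypothetical helper name.

```shell
# has_split_brain
# Exit status 0 if the log text on stdin contains DRBD's split-brain message,
# non-zero otherwise. Hypothetical helper name.
has_split_brain() {
    grep -q 'Split-Brain detected'
}

# On a live node you would run: has_split_brain < /var/log/messages
line='Jul 8 13:14:01 localhost kernel: block drbd1: Split-Brain detected but unresolved, dropping connection!'
if printf '%s\n' "$line" | has_split_brain; then
    echo "split brain logged"
fi
```

Both nodes reporting `Primary/Unknown`, as they do here, is the other symptom worth checking alongside the log.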
# drbd1 is now StandAlone; in this state the primary and the secondary do not talk to each other.
[root@drbd1 ~]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd1.localdomain, 2011-07-08 11:10:38
1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
ns:20405516 nr:567256 dw:567376 dr:20405706 al:2 bm:1246 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@drbd1 /]# service drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd1.localdomain, 2011-07-08 11:10:38
m:res cs ro ds p mounted fstype
1:fs StandAlone Primary/Unknown UpToDate/DUnknown r----- ext3
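At this point, if you try to restart the drbd service on drbd2, you will find that it does not come up: the startup script just sits waiting for the peer.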
[root@drbd2 /]# service drbd start
Starting DRBD resources: [ ]..........
***************************************************************
DRBD's startup script waits for the peer node(s) to appear.
- In case this node was already a degraded cluster before the
reboot the timeout is 120 seconds. [degr-wfc-timeout]
- If the peer was available before the reboot the timeout will
expire after 0 seconds. [wfc-timeout]
(These values are for resource 'fs'; 0 sec -> wait forever)
To abort waiting enter 'yes' [ -- ]:[ 13]:[ 15]:[ 16]:[ 18]:[ 19]:[ 20]:[ 22]:
Fix on drbd2:
[root@drbd2 /]# drbdadm disconnect fs
[root@drbd2 /]# drbdadm secondary fs
[root@drbd2 /]# drbdadm -- --discard-my-data connect fs
Even after these three steps, the drbd service on drbd2 still will not start.
The resource also has to be reconnected on drbd1:
[root@drbd1 ~]# drbdadm connect fs
Start the drbd service on drbd2 again; this time it works.
[root@drbd2 /]# service drbd start
Starting DRBD resources: [ ].
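The whole recovery can be written down as a checklist. The sketch below only prints the commands for review and does not run anything. It assumes DRBD 8.3 syntax (`drbdadm -- --discard-my-data connect`), that drbd2 is the node whose changes are discarded, and that `/mnt` is still mounted there, since a node cannot be demoted while its device is in use. `victim_cmds` and `survivor_cmds` are hypothetical names.

```shell
# victim_cmds RES MNT
# Prints the commands for the node whose data will be discarded (drbd2 here).
victim_cmds() {
    res="$1"; mnt="$2"
    echo "umount $mnt"                                 # release the device first
    echo "drbdadm disconnect $res"
    echo "drbdadm secondary $res"
    echo "drbdadm -- --discard-my-data connect $res"   # 8.3 syntax
}

# survivor_cmds RES
# Prints the command for the node whose data is kept (drbd1 here).
# Only needed when that node is in the StandAlone connection state.
survivor_cmds() {
    echo "drbdadm connect $1"
}

victim_cmds fs /mnt
survivor_cmds fs
```

Printing instead of executing is deliberate: `--discard-my-data` throws away every write made on that node since the split, so the choice of victim should be confirmed by a person.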
Now check the resynchronization:
[root@drbd2 /]# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@drbd2.localdomain, 2011-07-08 11:10:20
1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
ns:0 nr:185532 dw:185532 dr:0 al:0 bm:15 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:299000
[======>.............] sync'ed: 39.5% (299000/484532)K
finish: 0:00:28 speed: 10,304 (10,304) want: 10,240 K/sec
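To follow the resync from a script, the percentage can be read from the progress line. A minimal sketch, assuming the `sync'ed: NN.N%` format printed by this 8.3.11 module; `sync_percent` is a hypothetical helper name.

```shell
# sync_percent
# Reads /proc/drbd-style text on stdin and prints the resync percentage
# from the first progress line, or nothing if no resync is running.
sync_percent() {
    sed -n "s/.*sync'ed:[[:space:]]*\([0-9.]*\)%.*/\1/p" | head -n 1
}

# On a live node you would run: sync_percent < /proc/drbd
sample="[======>.............] sync'ed: 39.5% (299000/484532)K"
printf '%s\n' "$sample" | sync_percent   # prints 39.5
```

Once the resync finishes the progress line disappears, so empty output together with `cs:Connected` indicates that the two nodes are back in sync.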
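Addendum: although this was a manual simulation, the same problem occurs during a real failover.
1. A DRBD resource can be mounted on only one node at a time: the one currently in the Primary role.
2. Steps for a manual primary/secondary switchover:
a. Unmount the resource on the current primary. Your application stops at this point, so manual switchover is not recommended.
b. Demote the old primary to secondary: # drbdadm secondary resource_name
c. Promote the old secondary to primary: # drbdadm primary resource_name
d. Mount the resource on the new primary.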
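Steps a to d can be laid out per node as follows. This is a sketch that prints the commands in order rather than running them; the device `/dev/drbd1` and mount point `/mnt` are taken from the transcripts above, and `switchover_cmds` is a hypothetical name.

```shell
# switchover_cmds RES DEV MNT
# Prints the manual switchover steps, each prefixed with the node it runs on.
switchover_cmds() {
    res="$1"; dev="$2"; mnt="$3"
    echo "old-primary# umount $mnt"              # step a: application stops here
    echo "old-primary# drbdadm secondary $res"   # step b
    echo "new-primary# drbdadm primary $res"     # step c
    echo "new-primary# mount $dev $mnt"          # step d
}

switchover_cmds fs /dev/drbd1 /mnt
```

The order matters: demoting before promoting keeps the resource from having two primaries at the same moment, which is the condition that led to the split brain above.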
Reposted from: http://myhat.blog.51cto.com/391263/606318/