renay****@ybb*****
renay****@ybb*****
Tue Mar 17 22:38:22 JST 2015
Fukuda-san,

Good evening, this is Yamauchi.

Incidentally, if possible, it might help isolate the problem to check what happens when you drop external/stonith-helper and leave only external/xen0.
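For example, roughly as follows, based on the test.crm you posted earlier (only a sketch, not something I have tested; the stonith-helper primitives are simply removed from the groups and from the fencing topology):

### Group Configuration ###
group grpStonith1 \
        Stonith1-2

group grpStonith2 \
        Stonith2-2

### Fencing Topology ###
fencing_topology \
        lbv1.beta.com: Stonith1-2 \
        lbv2.beta.com: Stonith2-2

If the xen0 devices start cleanly in this form, the problem is most likely on the stonith-helper side.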
That's all.

----- Original Message -----
> From: "renay****@ybb*****" <renay****@ybb*****>
> To: "linux****@lists*****" <linux****@lists*****>
> Cc:
> Date: 2015/3/17, Tue 22:28
> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>
> Fukuda-san,
>
> Good evening, this is Yamauchi.
>
> So nothing seems to have changed...
>
> For now, some time tomorrow I will check on RHEL whether stonith-helper works with the combination of
>
> Heartbeat 3.0.6
> the latest Pacemaker
>
> and a similar configuration (the resource will be Dummy, and external/xen0 becomes external/ssh on my side).
>
> #If we could see the output from stonith-helper's -x option, it would be a little easier to narrow the problem down, though...
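> (For reference, one way to actually capture that trace, just as a sketch with an arbitrary log path: bash -x writes its trace to stderr, so redirecting stderr near the top of the plugin keeps the trace in a file even if stonithd discards the plugin's output.)
>
> #!/bin/bash -x
> # keep the xtrace somewhere readable; the path below is only an example
> exec 2>>/tmp/stonith-helper-trace.log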
>
> That's all.
>
>
> ----- Original Message -----
>> From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>> To: Hideo Yamauchi <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>> Date: 2015/3/17, Tue 21:24
>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>
>> Yamauchi-san,
>>
>> Good evening, this is Fukuda.
>> Thank you for the information about the latest version.
>>
>> I installed it right away.
>>
>> Here is the state after startup.
>>
>> The failed actions seem unchanged.
>>
>> # crm_mon -rfA
>> Last updated: Tue Mar 17 21:03:49 2015
>> Last change: Tue Mar 17 20:30:58 2015
>> Stack: heartbeat
>> Current DC: lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6) - partition with quorum
>> Version: 1.1.12-e32080b
>> 2 Nodes configured
>> 8 Resources configured
>>
>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>
>> Full list of resources:
>>
>> Resource Group: HAvarnish
>>     vip_208    (ocf::heartbeat:IPaddr2):  Started lbv1.beta.com
>>     varnishd   (lsb:varnish):             Started lbv1.beta.com
>> Resource Group: grpStonith1
>>     Stonith1-1 (stonith:external/stonith-helper): Stopped
>>     Stonith1-2 (stonith:external/xen0):           Stopped
>> Resource Group: grpStonith2
>>     Stonith2-1 (stonith:external/stonith-helper): Stopped
>>     Stonith2-2 (stonith:external/xen0):           Stopped
>> Clone Set: clone_ping [ping]
>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>
>> Node Attributes:
>> * Node lbv1.beta.com:
>>     + default_ping_set : 100
>> * Node lbv2.beta.com:
>>     + default_ping_set : 100
>>
>> Migration summary:
>> * Node lbv1.beta.com:
>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:39 2015'
>> * Node lbv2.beta.com:
>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 21:03:32 2015'
>>
>> Failed actions:
>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:37 2015', queued=0ms, exec=1085ms
>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=18, status=Error, exit-reason='none', last-rc-change='Tue Mar 17 21:03:30 2015', queued=0ms, exec=1061ms
>>
>> Here are the logs.
>>
>> # less /var/log/ha-debug
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Pacemaker support: yes
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: File /etc/ha.d//haresources exists.
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: This file is not used because pacemaker is enabled
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/heartbeat/ccm
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/cib
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/stonithd
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/lrmd
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/attrd
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: debug: Checking access of: /usr/local/heartbeat/libexec/pacemaker/crmd
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Core dumps could be lost if multiple dumps occur.
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: **************************
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4235]: info: Configuration validated. Starting heartbeat 3.0.6
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: heartbeat: version 3.0.6
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Heartbeat generation: 1423534116
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: seed is -1702799346
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth1
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound send socket to device: eth1
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: set SO_REUSEADDR
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: bound receive socket to device: eth1
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: glib: ucast: started on port 694 interface eth1 to 10.0.17.133
>> Mar 17 21:02:39 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'up'
>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Link lbv2.beta.com:eth1 up.
>> Mar 17 21:02:46 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status up
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Comm_now_up(): updating status to active
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Local status now set to: 'active'
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/heartbeat/ccm" (109,113)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/cib" (109,113)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/lrmd" (0,0)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/attrd" (109,113)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/crmd" (109,113)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: debug: get_delnodelist: delnodelist=
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4250]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/crmd" as uid 109 gid 113 (pid 4250)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4246]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/cib" as uid 109 gid 113 (pid 4246)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4249]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/attrd" as uid 109 gid 113 (pid 4249)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4245]: info: Starting "/usr/local/heartbeat/libexec/heartbeat/ccm" as uid 109 gid 113 (pid 4245)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4248]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/lrmd" as uid 0 gid 0 (pid 4248)
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4247]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 4247)
>> Mar 17 21:02:47 lbv1.beta.com ccm: [4245]: info: Hostname: lbv1.beta.com
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client ccm is set to 1024
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client attrd is set to 1024
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client stonith-ng is set to 1024
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: Status update for node lbv2.beta.com: status active
>> Mar 17 21:02:47 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client cib is set to 1024
>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [15:17]
>> Mar 17 21:02:51 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [19:21]
>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> Mar 17 21:02:52 lbv1.beta.com heartbeat: [4236]: info: the send queue length from heartbeat to client crmd is set to 1024
>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [24:26]
>> Mar 17 21:02:53 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [26:28]
>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: WARN: 1 lost packet(s) for [lbv2.beta.com] [30:32]
>> Mar 17 21:02:54 lbv1.beta.com heartbeat: [4236]: info: No pkts missing from lbv2.beta.com!
>>
>> # less /var/log/error
>>
>> Mar 17 21:02:47 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> Mar 17 21:02:48 lbv1 attrd[4249]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> Mar 17 21:02:53 lbv1 stonith-ng[4247]: error: ha_msg_dispatch: Ignored incoming message. Please set_msg_callback on hbclstat
>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>>
>> # cat syslog|egrep 'Mar 17 21:03|Mar 17 21:02' |egrep 'heartbeat|stonith|pacemaker|error'
>> Mar 17 21:03:24 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-115.bz2
>> Mar 17 21:03:27 lbv1 crmd[4250]: notice: run_graph: Transition 0 (Complete=15, Pending=0, Fired=0, Skipped=16, Incomplete=2, Source=/var/lib/pacemaker/pengine/pe-input-115.bz2): Stopped
>> Mar 17 21:03:29 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-116.bz2
>> Mar 17 21:03:34 lbv1 crmd[4250]: notice: run_graph: Transition 1 (Complete=8, Pending=0, Fired=0, Skipped=12, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-116.bz2): Stopped
>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>> Mar 17 21:03:37 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>> Mar 17 21:03:37 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-117.bz2
>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: notice: log_operation: Operation 'monitor' [4377] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ Performing: stonith -t external/stonith-helper -S ]
>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed to exec "stonith" ]
>> Mar 17 21:03:39 lbv1 stonith-ng[4247]: warning: log_operation: Stonith2-1:4377 [ failed: 2 ]
>> Mar 17 21:03:39 lbv1 crmd[4250]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=31, status=4, cib-update=42, confirmed=true) Error
>> Mar 17 21:03:40 lbv1 crmd[4250]: notice: run_graph: Transition 2 (Complete=12, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-117.bz2): Stopped
>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith2-1 on lbv1.beta.com: unknown error (1)
>> Mar 17 21:03:42 lbv1 pengine[4253]: warning: unpack_rsc_op_failure: Processing failed op start for Stonith1-1 on lbv2.beta.com: unknown error (1)
>> Mar 17 21:03:42 lbv1 pengine[4253]: notice: process_pe_message: Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-118.bz2
>> Mar 17 21:03:42 lbv1 IPaddr2(vip_208)[4448]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>> Mar 17 21:03:47 lbv1 crmd[4250]: notice: run_graph: Transition 3 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-118.bz2): Complete
>>
>> Thank you in advance.
>>
>> That's all.
>>
>> On March 17, 2015, at 18:31, <renay****@ybb*****> wrote:
>>
>>> Fukuda-san,
>>>
>>> Good evening, this is Yamauchi.
>>>
>>> Since it has not been tagged, today's latest version is:
>>>
>>> * https://github.com/ClusterLabs/pacemaker/tree/e32080b460f81486b85d08ec958582b3e72d858c
>>>
>>> You can download it from [Download ZIP] on the right-hand side.
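>>> (Equivalently, if git is available, the same revision can be checked out directly; the hash is the one in the URL above, so this is just a sketch of the same download:)
>>>
>>> $ git clone https://github.com/ClusterLabs/pacemaker.git
>>> $ cd pacemaker
>>> $ git checkout e32080b460f81486b85d08ec958582b3e72d858c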
>>>
>>> That's all.
>>>
>>> ----- Original Message -----
>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>> To: "renay****@ybb*****" <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>> Date: 2015/3/17, Tue 18:07
>>>> Subject: About STONITH errors during split-brain
>>>>
>>>> Yamauchi-san,
>>>>
>>>> Hello, this is Fukuda.
>>>>
>>>> I looked at this page:
>>>> https://github.com/ClusterLabs/pacemaker/tags
>>>>
>>>> pacemaker 1.1.12 561c4cf appears to be the latest there.
>>>> Sorry to trouble you, but could you tell me where a version newer than that can be found?
>>>>
>>>> Thank you in advance.
>>>>
>>>> That's all.
>>>>
>>>> On Tuesday, March 17, 2015, <renay****@ybb*****> wrote:
>>>>
>>>>> Fukuda-san,
>>>>>
>>>>> Hello, this is Yamauchi.
>>>>>
>>>>> Yes, it is old.
>>>>>
>>>>> Pacemaker support for Heartbeat 3.0.6 is surprisingly recent.
>>>>> Please install something newer. (You will have to build from source again, though...)
>>>>>
>>>>> It is available from the upstream GitHub:
>>>>> * https://github.com/ClusterLabs/pacemaker
>>>>>
>>>>> Depending on timing, the very latest master can produce errors; in that case it is best to walk back to an older revision.
>>>>>
>>>>> That's all.
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>>>> To: Hideo Yamauchi <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>>>> Date: 2015/3/17, Tue 16:06
>>>>>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>
>>>>>> Yamauchi-san,
>>>>>>
>>>>>> Hello, this is Fukuda.
>>>>>>
>>>>>> In an earlier mail you advised me to install the latest heartbeat and pacemaker.
>>>>>> So this time I installed heartbeat 3.0.6 and pacemaker 1.1.12:
>>>>>>
>>>>>> heartbeat configuration: Version = "3.0.6"
>>>>>> pacemaker configuration: Version = 1.1.12 (Build: 561c4cf)
>>>>>>
>>>>>> Does this mean pacemaker is still too old?
>>>>>>
>>>>>> Sorry to trouble you; thank you in advance.
>>>>>>
>>>>>> That's all.
>>>>>>
>>>>>> On March 17, 2015, at 14:59, <renay****@ybb*****> wrote:
>>>>>>
>>>>>>> Fukuda-san,
>>>>>>>
>>>>>>> Hello, this is Yamauchi.
>>>>>>>
>>>>>>> It just occurred to me: in an earlier exchange I replied as quoted below. Is that part in order on your side?
>>>>>>>
>>>>>>>>>>>>> 2) Heartbeat 3.0.6 + latest Pacemaker : OK
>>>>>>>>>>>>>
>>>>>>>>>>>>> > Apparently Heartbeat also needs to be combined with the latest 3.0.6.
>>>>>>>>>>>>> > * http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/cceeb47a7d8f
>>>>>>>
>>>>>>> Looking at the version in the crm_mon output below, it appears to be 1.1.12.
>>>>>>> To combine with Heartbeat 3.0.6, a considerably newer Pacemaker is required.
>>>>>>>
>>>>>>>> # crm_mon -rfA
>>>>>>>>
>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>> Stack: heartbeat
>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>
>>>>>>> I think you need at least everything after the following change:
>>>>>>>
>>>>>>> https://github.com/ClusterLabs/pacemaker/commit/f2302da063d08719d28367d8e362b8bfb0f85bf3
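>>>>>>>
>>>>>>> (One way to check whether a given build already contains that change, assuming you still have the git checkout it was built from, is a quick ancestry test; this is only a sketch:)
>>>>>>>
>>>>>>> $ cd pacemaker
>>>>>>> $ git merge-base --is-ancestor f2302da063d08719d28367d8e362b8bfb0f85bf3 HEAD && echo "change is included"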
>>>>>>>
>>>>>>> That's all.
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>>>>>> To: Hideo Yamauchi <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>>>>>> Date: 2015/3/17, Tue 14:38
>>>>>>>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>
>>>>>>>> Yamauchi-san,
>>>>>>>>
>>>>>>>> Hello, this is Fukuda.
>>>>>>>>
>>>>>>>> Is adding -x to the shebang line of stonith-helper what you meant?
>>>>>>>> I changed the first line of stonith-helper to #!/bin/bash -x and started the cluster.
>>>>>>>>
>>>>>>>> crm_mon looks the same as before.
>>>>>>>>
>>>>>>>> # crm_mon -rfA
>>>>>>>>
>>>>>>>> Last updated: Tue Mar 17 14:14:39 2015
>>>>>>>> Last change: Tue Mar 17 14:01:43 2015
>>>>>>>> Stack: heartbeat
>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>> 2 Nodes configured
>>>>>>>> 8 Resources configured
>>>>>>>>
>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>
>>>>>>>> Full list of resources:
>>>>>>>>
>>>>>>>> Resource Group: HAvarnish
>>>>>>>>     vip_208    (ocf::heartbeat:IPaddr2):  Started lbv1.beta.com
>>>>>>>>     varnishd   (lsb:varnish):             Started lbv1.beta.com
>>>>>>>> Resource Group: grpStonith1
>>>>>>>>     Stonith1-1 (stonith:external/stonith-helper): Stopped
>>>>>>>>     Stonith1-2 (stonith:external/xen0):           Stopped
>>>>>>>> Resource Group: grpStonith2
>>>>>>>>     Stonith2-1 (stonith:external/stonith-helper): Stopped
>>>>>>>>     Stonith2-2 (stonith:external/xen0):           Stopped
>>>>>>>> Clone Set: clone_ping [ping]
>>>>>>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>
>>>>>>>> Node Attributes:
>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>     + default_ping_set : 100
>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>     + default_ping_set : 100
>>>>>>>>
>>>>>>>> Migration summary:
>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:16 2015'
>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 14:12:21 2015'
>>>>>>>>
>>>>>>>> Failed actions:
>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 14:12:14 2015', queued=0ms, exec=1065ms
>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=26, status=Error, last-rc-change='Tue Mar 17 14:12:19 2015', queued=0ms, exec=1081ms
>>>>>>>>
>>>>>>>> I looked for other logs.
>>>>>>>>
>>>>>>>> This is at heartbeat startup:
>>>>>>>>
>>>>>>>> # less /var/log/pm_logconv.out
>>>>>>>> Mar 17 14:11:28 lbv1.beta.com info: Starting Heartbeat 3.0.6.
>>>>>>>> Mar 17 14:11:33 lbv1.beta.com info: Link lbv2.beta.com:eth1 is up.
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "ccm" process. (pid=13264)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "lrmd" process. (pid=13267)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "attrd" process. (pid=13268)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "stonithd" process. (pid=13266)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "cib" process. (pid=13265)
>>>>>>>> Mar 17 14:11:34 lbv1.beta.com info: Start "crmd" process. (pid=13269)
>>>>>>>>
>>>>>>>> # less /var/log/error
>>>>>>>> Mar 17 14:12:20 lbv1 crmd[13269]: error: process_lrm_event: Operation Stonith2-1_start_0 (node=lbv1.beta.com, call=26, status=4, cib-update=19, confirmed=true) Error
>>>>>>>>
>>>>>>>> Here is syslog grepped for stonith:
>>>>>>>>
>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: Starting child client "/usr/local/heartbeat/libexec/pacemaker/stonithd" (0,0)
>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13266]: info: Starting "/usr/local/heartbeat/libexec/pacemaker/stonithd" as uid 0 gid 0 (pid 13266)
>>>>>>>> Mar 17 14:11:34 lbv1 stonithd[13266]: notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
>>>>>>>> Mar 17 14:11:34 lbv1 heartbeat: [13255]: info: the send queue length from heartbeat to client stonithd is set to 1024
>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: setup_cib: Watching for stonith topology changes
>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: notice: unpack_config: On loss of CCM Quorum: Ignore
>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>> Mar 17 14:11:40 lbv1 stonithd[13266]: warning: handle_startup_fencing: Blind faith: not fencing unseen nodes
>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-1' to the device list (1 active devices)
>>>>>>>> Mar 17 14:11:41 lbv1 stonithd[13266]: notice: stonith_device_register: Added 'Stonith2-2' to the device list (2 active devices)
>>>>>>>> Mar 17 14:12:04 lbv1 stonithd[13266]: notice: xml_patch_version_check: Versions did not change in patch 0.5.0
>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: notice: log_operation: Operation 'monitor' [13386] for device 'Stonith2-1' returned: -201 (Generic Pacemaker error)
>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ Performing: stonith -t external/stonith-helper -S ]
>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed to exec "stonith" ]
>>>>>>>> Mar 17 14:12:20 lbv1 stonithd[13266]: warning: log_operation: Stonith2-1:13386 [ failed: 2 ]
>>>>>>>>
>>>>>>>> Thank you in advance.
>>>>>>>>
>>>>>>>> That's all.
>>>>>>>>
>>>>>>>> On March 17, 2015, at 13:32, <renay****@ybb*****> wrote:
>>>>>>>>
>>>>>>>>> Fukuda-san,
>>>>>>>>>
>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>
>>>>>>>>> So it looks like the problem is in stonith-helper's start.
>>>>>>>>>
>>>>>>>>> Putting
>>>>>>>>>
>>>>>>>>> #!/bin/bash -x
>>>>>>>>>
>>>>>>>>> at the top of stonith-helper and then starting the cluster may tell us something.
>>>>>>>>>
>>>>>>>>> Incidentally, I would expect stonith-helper's own log output to show up somewhere as well...
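>>>>>>>>> (For example, something along these lines might find it; the file names are just the logs you have shown so far, so adjust them to your environment:)
>>>>>>>>>
>>>>>>>>> # egrep -i 'stonith-helper' /var/log/ha-debug /var/log/syslog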
>>>>>>>>>
>>>>>>>>> That's all.
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>>>>>>>> To: Hideo Yamauchi <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>>>>>>>> Date: 2015/3/17, Tue 12:31
>>>>>>>>>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>>>
>>>>>>>>>> Yamauchi-san,
>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>
>>>>>>>>>> Hello, this is Fukuda.
>>>>>>>>>>
>>>>>>>>>> xen0 is in the same directory.
>>>>>>>>>>
>>>>>>>>>> # pwd
>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external
>>>>>>>>>>
>>>>>>>>>> # ls
>>>>>>>>>> drac5          ibmrsa         kdumpcheck  riloe           vmware
>>>>>>>>>> dracmc-telnet  ibmrsa-telnet  libvirt     ssh             xen0
>>>>>>>>>> hetzner        ipmi           nut         stonith-helper  xen0-ha
>>>>>>>>>> hmchttp        ippower9258    rackpdu     vcenter
>>>>>>>>>>
>>>>>>>>>> Thank you in advance.
>>>>>>>>>>
>>>>>>>>>> That's all.
>>>>>>>>>>
>>>>>>>>>> 2015-03-17 10:53 GMT+09:00 <renay****@ybb*****>:
>>>>>>>>>>
>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>
>>>>>>>>>>> Hello, this is Yamauchi.
>>>>>>>>>>>
>>>>>>>>>>>> There was no standard output or standard error output.
>>>>>>>>>>>>
>>>>>>>>>>>> Is something wrong with stonith-helper?
>>>>>>>>>>>> Since stonith-helper is a shell script, I had not paid much attention to how it was installed.
>>>>>>>>>>>> stonith-helper is placed here:
>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>
>>>>>>>>>>> Is xen0 also in this directory?
>>>>>>>>>>> If it is not, that is a problem, so please try copying the stonith-helper file, with its attributes preserved, into the same directory as xen0.
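>>>>>>>>>>> (Roughly like this; the source path is only a placeholder for wherever your pm_extras copy of stonith-helper actually is:)
>>>>>>>>>>>
>>>>>>>>>>> # cp -p /path/to/pm_extras/stonith-helper /usr/local/heartbeat/lib/stonith/plugins/external/   # -p keeps mode and ownership
>>>>>>>>>>> # ls -l /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper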
>>>>>>>>>>>
>>>>>>>>>>> If it runs after that, it means there is a problem with the pm_extras installation.
>>>>>>>>>>>
>>>>>>>>>>> That's all.
>>>>>>>>>>>
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>>>>>>>>>>>> To: Hideo Yamauchi <renay****@ybb*****>; "linux****@lists*****" <linux****@lists*****>
>>>>>>>>>>>> Date: 2015/3/17, Tue 10:31
>>>>>>>>>>>> Subject: Re: [Linux-ha-jp] About STONITH errors during split-brain
>>>>>>>>>>>>
>>>>>>>>>>>> Yamauchi-san,
>>>>>>>>>>>> cc: Matsushima-san
>>>>>>>>>>>>
>>>>>>>>>>>> Good morning, this is Fukuda.
>>>>>>>>>>>> Thank you for the crm example.
>>>>>>>>>>>>
>>>>>>>>>>>> I adapted it to our environment right away.
>>>>>>>>>>>>
>>>>>>>>>>>> $ cat test.crm
>>>>>>>>>>>> ### Cluster Option ###
>>>>>>>>>>>> property \
>>>>>>>>>>>>         no-quorum-policy="ignore" \
>>>>>>>>>>>>         stonith-enabled="true" \
>>>>>>>>>>>>         startup-fencing="false" \
>>>>>>>>>>>>         stonith-timeout="710s" \
>>>>>>>>>>>>         crmd-transition-delay="2s"
>>>>>>>>>>>>
>>>>>>>>>>>> ### Resource Default ###
>>>>>>>>>>>> rsc_defaults \
>>>>>>>>>>>>         resource-stickiness="INFINITY" \
>>>>>>>>>>>>         migration-threshold="1"
>>>>>>>>>>>>
>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>> group HAvarnish \
>>>>>>>>>>>>         vip_208 \
>>>>>>>>>>>>         varnishd
>>>>>>>>>>>>
>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>         Stonith1-1 \
>>>>>>>>>>>>         Stonith1-2
>>>>>>>>>>>>
>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>         Stonith2-1 \
>>>>>>>>>>>>         Stonith2-2
>>>>>>>>>>>>
>>>>>>>>>>>> ### Clone Configuration ###
>>>>>>>>>>>> clone clone_ping \
>>>>>>>>>>>>         ping
>>>>>>>>>>>>
>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>         lbv1.beta.com: Stonith1-1 Stonith1-2 \
>>>>>>>>>>>>         lbv2.beta.com: Stonith2-1 Stonith2-2
>>>>>>>>>>>>
>>>>>>>>>>>> ### Primitive Configuration ###
>>>>>>>>>>>> primitive vip_208 ocf:heartbeat:IPaddr2 \
>>>>>>>>>>>>         params \
>>>>>>>>>>>>                 ip="192.168.17.208" \
>>>>>>>>>>>>                 nic="eth0" \
>>>>>>>>>>>>                 cidr_netmask="24" \
>>>>>>>>>>>>         op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>         op monitor interval="5s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive varnishd lsb:varnish \
>>>>>>>>>>>>         op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>         op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive ping ocf:pacemaker:ping \
>>>>>>>>>>>>         params \
>>>>>>>>>>>>                 name="default_ping_set" \
>>>>>>>>>>>>                 host_list="192.168.17.254" \
>>>>>>>>>>>>                 multiplier="100" \
>>>>>>>>>>>>                 dampen="1" \
>>>>>>>>>>>>         op start interval="0s" timeout="90s" on-fail="restart" \
>>>>>>>>>>>>         op monitor interval="10s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op stop interval="0s" timeout="100s" on-fail="fence"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive Stonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>         params \
>>>>>>>>>>>>                 pcmk_reboot_retries="1" \
>>>>>>>>>>>>                 pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>                 hostlist="lbv1.beta.com" \
>>>>>>>>>>>>                 dead_check_target="192.168.17.132 10.0.17.132" \
>>>>>>>>>>>>                 standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>                 run_online_check="yes" \
>>>>>>>>>>>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive Stonith1-2 stonith:external/xen0 \
>>>>>>>>>>>>         params \
>>>>>>>>>>>>                 pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>                 hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
>>>>>>>>>>>>                 dom0="xen0.beta.com" \
>>>>>>>>>>>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive Stonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>         params \
>>>>>>>>>>>>                 pcmk_reboot_retries="1" \
>>>>>>>>>>>>                 pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>                 hostlist="lbv2.beta.com" \
>>>>>>>>>>>>                 dead_check_target="192.168.17.133 10.0.17.133" \
>>>>>>>>>>>>                 standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>>>>>>>>>>>>                 run_online_check="yes" \
>>>>>>>>>>>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>> primitive Stonith2-2 stonith:external/xen0 \
>>>>>>>>>>>>         params \
>>>>>>>>>>>>                 pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>                 hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
>>>>>>>>>>>>                 dom0="xen0.beta.com" \
>>>>>>>>>>>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>
>>>>>>>>>>>> ### Resource Location ###
>>>>>>>>>>>> location HA_location-1 HAvarnish \
>>>>>>>>>>>>         rule 200: #uname eq lbv1.beta.com \
>>>>>>>>>>>>         rule 100: #uname eq lbv2.beta.com
>>>>>>>>>>>>
>>>>>>>>>>>> location HA_location-2 HAvarnish \
>>>>>>>>>>>>         rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
>>>>>>>>>>>>
>>>>>>>>>>>> location HA_location-3 grpStonith1 \
>>>>>>>>>>>>         rule -INFINITY: #uname eq lbv1.beta.com
>>>>>>>>>>>>
>>>>>>>>>>>> location HA_location-4 grpStonith2 \
>>>>>>>>>>>>         rule -INFINITY: #uname eq lbv2.beta.com
>>>>>>>>>>>>
>>>>>>>>>>>> When I loaded this, the messages differed from yesterday.
>>>>>>>>>>>> The ping messages were gone.
>>>>>>>>>>>>
>>>>>>>>>>>> # crm_mon -rfA
>>>>>>>>>>>> Last updated: Tue Mar 17 10:21:28 2015
>>>>>>>>>>>> Last change: Tue Mar 17 10:21:09 2015
>>>>>>>>>>>> Stack: heartbeat
>>>>>>>>>>>> Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
>>>>>>>>>>>> Version: 1.1.12-561c4cf
>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>> 8 Resources configured
>>>>>>>>>>>>
>>>>>>>>>>>> Online: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>
>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>
>>>>>>>>>>>> Resource Group: HAvarnish
>>>>>>>>>>>>     vip_208    (ocf::heartbeat:IPaddr2):  Started lbv1.beta.com
>>>>>>>>>>>>     varnishd   (lsb:varnish):             Started lbv1.beta.com
>>>>>>>>>>>> Resource Group: grpStonith1
>>>>>>>>>>>>     Stonith1-1 (stonith:external/stonith-helper): Stopped
>>>>>>>>>>>>     Stonith1-2 (stonith:external/xen0):           Stopped
>>>>>>>>>>>> Resource Group: grpStonith2
>>>>>>>>>>>>     Stonith2-1 (stonith:external/stonith-helper): Stopped
>>>>>>>>>>>>     Stonith2-2 (stonith:external/xen0):           Stopped
>>>>>>>>>>>> Clone Set: clone_ping [ping]
>>>>>>>>>>>>     Started: [ lbv1.beta.com lbv2.beta.com ]
>>>>>>>>>>>>
>>>>>>>>>>>> Node Attributes:
>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>     + default_ping_set : 100
>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>     + default_ping_set : 100
>>>>>>>>>>>>
>>>>>>>>>>>> Migration summary:
>>>>>>>>>>>> * Node lbv2.beta.com:
>>>>>>>>>>>>    Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>> * Node lbv1.beta.com:
>>>>>>>>>>>>    Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
>>>>>>>>>>>>
>>>>>>>>>>>> Failed actions:
>>>>>>>>>>>>     Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
>>>>>>>>>>>>     Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms
>>>>>>>>>>>>
>>>>>>>>>>>> The log from /var/log/ha-debug:
>>>>>>>>>>>>
>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
>>>>>>>>>>>> IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used
>>>>>>>>>>>>
>>>>>>>>>>>> There was no standard output or standard error output.
>>>>>>>>>>>>
>>>>>>>>>>>> Is something wrong with stonith-helper?
>>>>>>>>>>>> Since stonith-helper is a shell script, I had not paid much attention to how it was installed.
>>>>>>>>>>>> stonith-helper is placed here:
>>>>>>>>>>>> /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you in advance.
>>>>>>>>>>>>
>>>>>>>>>>>> That's all.
>>>>>>>>>>>>
>>>>>>>>>>>> 2015-03-17 9:45 GMT+09:00 <renay****@ybb*****>:
>>>>>>>>>>>>
>>>>>>>>>>>>> Fukuda-san,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good morning, this is Yamauchi.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just in case, here is an excerpt from an example I have at hand that uses multiple stonith devices.
>>>>>>>>>>>>> (In practice, please be careful about line breaks.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The following example is a configuration for the PM 1.1 series:
>>>>>>>>>>>>> on nodea, stonith is executed in the order prmStonith1-1, prmStonith1-2;
>>>>>>>>>>>>> on nodeb, stonith is executed in the order prmStonith2-1, prmStonith2-2.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The stonith plugins themselves are helper and ssh.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>> ### Group Configuration ###
>>>>>>>>>>>>> group grpStonith1 \
>>>>>>>>>>>>>         prmStonith1-1 \
>>>>>>>>>>>>>         prmStonith1-2
>>>>>>>>>>>>>
>>>>>>>>>>>>> group grpStonith2 \
>>>>>>>>>>>>>         prmStonith2-1 \
>>>>>>>>>>>>>         prmStonith2-2
>>>>>>>>>>>>>
>>>>>>>>>>>>> ### Fencing Topology ###
>>>>>>>>>>>>> fencing_topology \
>>>>>>>>>>>>>         nodea: prmStonith1-1 prmStonith1-2 \
>>>>>>>>>>>>>         nodeb: prmStonith2-1 prmStonith2-2
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>> primitive prmStonith1-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>         params \
>>>>>>>>>>>>>                 pcmk_reboot_retries="1" \
>>>>>>>>>>>>>                 pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>                 hostlist="nodea" \
>>>>>>>>>>>>>                 dead_check_target="192.168.28.60 192.168.28.70" \
>>>>>>>>>>>>>                 standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>                 run_online_check="yes" \
>>>>>>>>>>>>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>
>>>>>>>>>>>>> primitive prmStonith1-2 stonith:external/ssh \
>>>>>>>>>>>>>         params \
>>>>>>>>>>>>>                 pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>                 hostlist="nodea" \
>>>>>>>>>>>>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>         op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>
>>>>>>>>>>>>> primitive prmStonith2-1 stonith:external/stonith-helper \
>>>>>>>>>>>>>         params \
>>>>>>>>>>>>>                 pcmk_reboot_retries="1" \
>>>>>>>>>>>>>                 pcmk_reboot_timeout="40s" \
>>>>>>>>>>>>>                 hostlist="nodeb" \
>>>>>>>>>>>>>                 dead_check_target="192.168.28.61 192.168.28.71" \
>>>>>>>>>>>>>                 standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>>>>>>>>>>>>>                 run_online_check="yes" \
>>>>>>>>>>>>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>>
>>>>>>>>>>>>> primitive prmStonith2-2 stonith:external/ssh \
>>>>>>>>>>>>>         params \
>>>>>>>>>>>>>                 pcmk_reboot_timeout="60s" \
>>>>>>>>>>>>>                 hostlist="nodeb" \
>>>>>>>>>>>>>         op start interval="0s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>         op monitor interval="3600s" timeout="60s" on-fail="restart" \
>>>>>>>>>>>>>         op stop interval="0s" timeout="60s" on-fail="ignore"
>>>>>>>>>>>>> (snip)
>>>>>>>>>>>>> location rsc_location-grpStonith1-2 grpStonith1 \
>>>>>>>>>>>>>         rule -INFINITY: #uname eq nodea
>>>>>>>>>>>>> location rsc_location-grpStonith2-3 grpStonith2 \
>>>>>>>>>>>>>         rule -INFINITY: #uname eq nodeb
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's all.
>>
>> --
>> ELF Systems
>> Masamichi Fukuda
>> mail to: masamichi_fukud****@elf-s*****
>
> _______________________________________________
> Linux-ha-japan mailing list
> Linux****@lists*****
> http://lists.sourceforge.jp/mailman/listinfo/linux-ha-japan