Hi,
Could you please help us find the root cause of why the orinoco1 server rebooted (CRS node eviction)?

Below are the error messages:

Jun 27 17:51:59 orinoco1 logger: Oracle CSSD failure 134.
Jun 27 17:51:59 orinoco1 logger: Oracle CRS failure. Rebooting for cluster integrity.
Jun 27 17:52:00 orinoco1 logger: Oracle clsomon failed with fatal status 12.
Jun 27 17:52:00 orinoco1 logger: Oracle CRS failure. Rebooting for cluster integrity.



====== OCSSD LOG ============================================================================================================
[ CSSD]2011-06-27 17:43:21.978 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x7ac510) proc(0x777f60) pid() proto(10:2:1:1)
[ CSSD]2011-06-27 17:43:45.328 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x7af170) proc(0x773c10) pid() proto(10:2:1:1)
[ CSSD]2011-06-27 17:44:45.678 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x7af170) proc(0x773c10) pid() proto(10:2:1:1)
[ CSSD]2011-06-27 17:45:02.940 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x7af170) proc(0x773c10) pid(11998) proto(10:2:1:1)
[ CSSD]2011-06-27 17:45:16.233 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x7a2d80) proc(0x77a900) pid(12822) proto(10:2:1:1)
[ CSSD]2011-06-27 17:45:45.970 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x77abf0) proc(0x777e60) pid() proto(10:2:1:1)
[ CSSD]2011-06-27 17:46:46.330 [1199618400] >TRACE: clssgmClientConnectMsg: Connect from con(0x77abf0) proc(0x777e60) pid() proto(10:2:1:1)
[ CSSD]2011-06-27 17:50:21.821 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 50% heartbeat fatal, eviction in 29.560 seconds
[ CSSD]2011-06-27 17:50:22.823 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 50% heartbeat fatal, eviction in 28.560 seconds
[ CSSD]2011-06-27 17:50:36.831 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 75% heartbeat fatal, eviction in 14.550 seconds
[ CSSD]2011-06-27 17:50:37.823 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 75% heartbeat fatal, eviction in 13.560 seconds
[ CSSD]2011-06-27 17:50:45.829 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 90% heartbeat fatal, eviction in 5.560 seconds
[ CSSD]2011-06-27 17:50:46.831 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 90% heartbeat fatal, eviction in 4.560 seconds
[ CSSD]2011-06-27 17:50:47.833 [1241577824] >TRACE: clssnmPollingThread: node orinoco2 (2) is impending reconfig
[ CSSD]2011-06-27 17:50:47.833 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 90% heartbeat fatal, eviction in 3.550 seconds
[ CSSD]2011-06-27 17:50:48.825 [1241577824] >TRACE: clssnmPollingThread: node orinoco2 (2) is impending reconfig
[ CSSD]2011-06-27 17:50:48.825 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 90% heartbeat fatal, eviction in 2.560 seconds
[ CSSD]2011-06-27 17:50:49.827 [1241577824] >TRACE: clssnmPollingThread: node orinoco2 (2) is impending reconfig
[ CSSD]2011-06-27 17:50:49.827 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 90% heartbeat fatal, eviction in 1.560 seconds
[ CSSD]2011-06-27 17:50:50.829 [1241577824] >TRACE: clssnmPollingThread: node orinoco2 (2) is impending reconfig
[ CSSD]2011-06-27 17:50:50.829 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 90% heartbeat fatal, eviction in 0.560 seconds
==============================================================================================================================


====== /var/log/messages ====================================================================================================
Jun 27 17:45:01 orinoco1 su(pam_unix)[11911]: session opened for user oracle by (uid=0)
Jun 27 17:45:01 orinoco1 su(pam_unix)[11911]: session closed for user oracle
Jun 27 17:47:40 orinoco1 kernel: bnx2: eth0 NIC Link is Down
Jun 27 17:47:41 orinoco1 kernel: LLT INFO V-14-1-10205 link 2 (eth0) node 0 in trouble
Jun 27 17:47:41 orinoco1 kernel: LLT INFO V-14-1-10205 link 2 (eth0) node 2 in trouble
Jun 27 17:47:41 orinoco1 kernel: LLT INFO V-14-1-10205 link 2 (eth0) node 5 in trouble
Jun 27 17:47:41 orinoco1 kernel: LLT INFO V-14-1-10205 link 2 (eth0) node 1 in trouble
Jun 27 17:47:41 orinoco1 kernel: LLT INFO V-14-1-10205 link 2 (eth0) node 3 in trouble
Jun 27 17:47:43 orinoco1 kernel: bnx2: eth0 NIC Link is Up, 1000 Mbps full duplex
Jun 27 17:47:44 orinoco1 kernel: LLT INFO V-14-1-10024 link 2 (eth0) node 0 active
Jun 27 17:47:44 orinoco1 kernel: LLT INFO V-14-1-10024 link 2 (eth0) node 2 active
Jun 27 17:47:44 orinoco1 kernel: LLT INFO V-14-1-10024 link 2 (eth0) node 5 active
Jun 27 17:47:44 orinoco1 kernel: LLT INFO V-14-1-10024 link 2 (eth0) node 1 active
Jun 27 17:47:44 orinoco1 kernel: LLT INFO V-14-1-10024 link 2 (eth0) node 3 active
Jun 27 17:47:45 orinoco1 kernel: o2net: connection to node orinoco2 (num 1) at 199.40.40.234:7777 has been idle for 10.0 seconds, shutting it down.
Jun 27 17:47:45 orinoco1 kernel: (0,0):o2net_idle_timer:1426 here are some times that might help debug the situation: (tmr 1309168055.597322 now 1309168065.596662 dr 1309168055.597308 adv 1309168055.597329:1309168055.597330 func (d5542a8e:504) 1309168035.598570:1309168035.598693)
Jun 27 17:47:45 orinoco1 kernel: o2net: no longer connected to node orinoco2 (num 1) at 199.40.40.234:7777
==============================================================================================================================


From the above messages we concluded that the server was rebooted to preserve cluster integrity, following the network interface failure logged in /var/log/messages.
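
To line up the timestamps, here is a minimal Python sketch (my own helper, not part of any Oracle tooling) that adds the "eviction in N seconds" value to each CSSD warning timestamp to get the projected eviction time, and compares it with the eth0 link-down event. The name projected_evictions is made up for illustration, and the year 2011 for the syslog entry is an assumption taken from the CSSD log, since /var/log/messages does not record the year.

```python
import re
from datetime import datetime, timedelta

# Three of the warning lines copied from the CSSD log excerpt above.
CSSD_LINES = [
    "[ CSSD]2011-06-27 17:50:21.821 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 50% heartbeat fatal, eviction in 29.560 seconds",
    "[ CSSD]2011-06-27 17:50:36.831 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 75% heartbeat fatal, eviction in 14.550 seconds",
    "[ CSSD]2011-06-27 17:50:50.829 [1241577824] >WARNING: clssnmPollingThread: node orinoco2 (2) at 90% heartbeat fatal, eviction in 0.560 seconds",
]

# "bnx2: eth0 NIC Link is Down" from /var/log/messages.
# Assumption: the year is 2011, as in the CSSD log.
NIC_DOWN = datetime(2011, 6, 27, 17, 47, 40)

# Captures the log timestamp and the remaining seconds before eviction.
PATTERN = re.compile(
    r"\](\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*eviction in (\d+\.\d+) seconds"
)


def projected_evictions(lines):
    """Return log timestamp + remaining seconds for each warning line."""
    result = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            logged = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            result.append(logged + timedelta(seconds=float(match.group(2))))
    return result


if __name__ == "__main__":
    for when in projected_evictions(CSSD_LINES):
        gap = (when - NIC_DOWN).total_seconds()
        print(f"projected eviction {when.time()}  ({gap:.3f} s after eth0 link down)")
```

On these lines all three projections land within about 10 ms of 17:50:51.38, which is roughly 191 seconds after eth0 went down at 17:47:40 (the link came back up at 17:47:43).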

Can somebody confirm whether this was caused by:
i) a private interconnect network failure
or
ii) a voting disk issue

Please also confirm that this is not due to the glibc bug that causes random evictions. Note that the O/S is Red Hat Enterprise Linux AS release 4 (Nahant Update 4) with kernel 2.6.9-42.ELsmp.

Glibc : glibc-2.3.4-2.25


Thanks