RAC node reboot for no apparent reason
I saw some strange RAC behaviour yesterday afternoon: one node of a two-node cluster rebooted for no apparent reason. Nothing appears in /var/log/messages around the time the server restarted. However, "fsck" was run on restart. Could this be a system crash caused by a hardware error? If so, how can I find out the reason?
I looked through the Oracle logs (starting from alert.log) and could not make out anything except the regular startup messages. O2CB is configured with a heartbeat of 60.
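For anyone who wants to narrow down the same kind of log window, here is a minimal sketch of pulling syslog-style entries from around a given time. It assumes the standard "Mon dd HH:MM:SS" prefix used in /var/log/messages; the function name `entries_near` and the 10-minute default window are illustrative assumptions only, not part of any Oracle or Linux tool.

```python
import sys
from datetime import datetime, timedelta


def entries_near(lines, when, minutes=10, year=None):
    """Return syslog-style lines stamped within +/- `minutes` of `when`.

    Syslog timestamps carry no year, so the year is taken from `when`
    unless given explicitly (an assumption that breaks across New Year).
    """
    year = year or when.year
    window = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        try:
            # The first 15 characters look like "Jan  5 14:03:22".
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parsable timestamp
        if abs(stamp - when) <= window:
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Usage: python entries_near.py /var/log/messages "2007-01-05 14:00:00"
    path, when_text = sys.argv[1], sys.argv[2]
    when = datetime.strptime(when_text, "%Y-%m-%d %H:%M:%S")
    with open(path, errors="replace") as handle:
        for entry in entries_near(handle, when):
            print(entry)
```

The same filter can be pointed at any other log that uses the syslog timestamp prefix; logs with a different timestamp layout would need a different format string.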
What is even stranger is that once the first instance detected that node2 was unavailable, the cluster did not shut down but continued normally. When node2 came back, node1 promptly did instance recovery and everything was back to normal.
Can anyone help me determine why node2 shut down, and also why the clusterware did not shut down the cluster (that is, the surviving instance)?
OS: RHES Linux 4 U4, x86_64
Oracle: 10.2.0.3, x86_64
I will gladly provide any further information (obviously :) )
Thanks in advance and regards,