Shafiulla Syed's Technical Blog: Oracle RAC node unavailable with error: Server unexpectedly closed network connection6]clsc_connect: (0x251c670) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_node2_))

Around midnight I received a call from the monitoring team: one of the critical production database nodes was unavailable.

Because this data center (DC) has frequent power problems, I expected the node to recover once power was restored and everything came back up. The problem persisted, however, with frequent node evictions.

The Oracle cluster was up on only one node; the other node was still facing the issue.

For my initial validation, I made sure all the shared storage was up and that time was in sync. Then, using the output below, I quickly found that the problem was with cluster interconnect communication. Note that ocssd.bin is running on node1 but is missing from the node2 listing.

[oracle@node1 ~]$ ps -ef| egrep 'crsd.bin|ocssd.bin|evmd.bin' | grep -v grep

oracle 11815 11809 0 12:20 ? 00:00:00 /u01/app/crs/bin/evmd.bin

root 11929 10953 3 12:20 ? 00:01:07 /u01/app/crs/bin/crsd.bin reboot

oracle 12641 12148 0 12:20 ? 00:00:06 /u01/app/crs/bin/ocssd.bin

[oracle@node2 ~]$ ps -ef| egrep 'crsd.bin|ocssd.bin|evmd.bin' | grep -v grep

oracle 11508 11506 0 12:31 ? 00:00:00 /u01/app/crs/bin/evmd.bin

root 11661 10700 3 12:31 ? 00:00:47 /u01/app/crs/bin/crsd.bin reboot
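The comparison above can be automated. The following is a minimal sketch, not part of the original troubleshooting: the function name missing_daemons is made up for illustration, and it assumes the three daemons show up in the process listing under the binary names seen above. It reads `ps -ef` output from standard input and prints every daemon that is absent.

```shell
#!/bin/sh
# Hypothetical helper: list the clusterware daemons that do NOT appear
# in a `ps -ef` listing supplied on stdin.
missing_daemons() {
    # Capture the whole listing once so it can be searched per daemon.
    listing=$(cat)
    for daemon in crsd.bin ocssd.bin evmd.bin; do
        # -F: fixed string, -q: quiet. The leading slash matches the
        # binary at the end of its installation path.
        if ! printf '%s\n' "$listing" | grep -Fq "/$daemon"; then
            echo "$daemon"
        fi
    done
}
```

Run on each node as `ps -ef | missing_daemons`. An empty result means all three daemons were found; for the node2 listing above it would print ocssd.bin.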

Confirming my suspicion, the messages below in crsd.log pointed to the same problem: CRS on node2 was waiting for CSS, which is indirectly related to the interconnect issue.

vi /u01/app/crs/log/node2/crsd/crsd.log

Server unexpectedly closed network connection6]clsc_connect: (0x251c670) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_node2_))

2020-08-30 09:52:42.975: [ CSSCLNT][2274735840]clsssInitNative: connect failed, rc 9

2020-08-30 09:52:42.975: [ CRSRTI][2274735840]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..
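When crsd.log is long, it helps to know how often this message repeats and when it was last logged. The sketch below is an illustration only: the function name css_not_ready_summary is hypothetical, and it assumes each matching line starts with a date and a time field, as in the excerpt above. It reads the log from standard input and prints the number of "CSS is not ready" lines followed by the timestamp of the last one.

```shell
#!/bin/sh
# Hypothetical helper: count "CSS is not ready" lines in a crsd.log
# supplied on stdin and report the timestamp of the most recent one.
css_not_ready_summary() {
    awk '
        # index() does a plain substring search, so no regex escaping is needed.
        index($0, "CSS is not ready") {
            n++
            last = $1 " " $2    # first two fields: date and time
        }
        END { printf "%d %s\n", n + 0, last }
    '
}
```

For example: `css_not_ready_summary < /u01/app/crs/log/node2/crsd/crsd.log`. A count that keeps growing between runs suggests CSS is still not coming up on that node.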

[oracle@node1 ~]$ ping node2-priv
PING node2-priv..cstt.gov (172.16.0.2) 56(84) bytes of data.
From node1-priv..cstt.gov (172.16.0.1) icmp_seq=2 Destination Host Unreachable
From node1-priv..cstt.gov (172.16.0.1) icmp_seq=3 Destination Host Unreachable
From node1-priv..cstt.gov (172.16.0.1) icmp_seq=4 Destination Host Unreachable
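For a monitoring script, the ping result above can be reduced to a single word. This is a rough sketch under stated assumptions: the function name interconnect_status is invented here, and the patterns assume the Linux ping output format shown above. It reads ping output from standard input and classifies the private link.

```shell
#!/bin/sh
# Hypothetical helper: classify ping output supplied on stdin as
# "reachable", "degraded" or "unreachable".
interconnect_status() {
    awk '
        / bytes from /                 { ok = 1 }   # at least one reply arrived
        /Destination Host Unreachable/ { bad = 1 }  # no route or ARP answer
        /100% packet loss/             { bad = 1 }  # summary line: nothing came back
        END {
            if (ok && !bad) print "reachable"
            else if (ok)    print "degraded"
            else            print "unreachable"     # also covers empty input
        }
    '
}
```

Usage: `ping -c 4 node2-priv | interconnect_status`. The output captured above contains only "Destination Host Unreachable" lines, so it would be classified as unreachable.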

Solution:

The break in cluster interconnect communication is the culprit here. Fixing the private network issue resolves the problem.

In our case, the private Ethernet cards were up and active on both nodes, but the nodes were unable to communicate over their private IPs.

This seemed strange to us, but a physical inspection in the DC found a fault in the network cable and a physical port.

As soon as the physical issue was fixed, we rebooted the server, which came up successfully, and the problem was resolved.

Sunday, August 30, 2020
