It’s the Bloody Defaults
If you’re a DBA having problems with Oracle Data Guard dropping connections for no apparent reason, the first port of call is usually the Network Team.
Especially if you’re seeing errors like this:
“RFS[7]: Possible network disconnect with primary database
Dataguard log shipping is failing”
The Network Team will hopefully do all the investigations they can. Checking:
- ICMP ping responses
- ICMP trace routes
- Routing
- Interface speed/duplex mismatches on the hosts, switches, routers and firewalls
They’ll then hopefully start looking at tcpdumps or packet captures from switches.
Then they’ll scratch their heads and go, “I can’t see anything wrong…”
Both teams will then start hypothesising about more convoluted possibilities. Is it an MTU issue? Is the traffic going over the Internet / VPN and it’s ‘just the way it is’?
…No
…We’re all guilty.
…We should all be fired.
The first thing to ask yourself is this, “Does the traffic pass through a Firewall?”
Then ask, “Is it a Cisco Firewall?”
If the answer is yes to both, then the place to start looking is Application Inspection. By default, both the Cisco PIX and Cisco ASA have SQL*Net Application Inspection turned on. This is great for SQL*Net traffic, but the problem is that Oracle Data Guard uses the same TCP port: 1521.
So Data Guard traffic is subject to the same Inspection and, time and time again, this causes issues.
Look for this in your PIX 6.x configs:
fixup protocol sqlnet 1521
Or this in your PIX/ASA 7.x or 8.x configs:
policy-map global_policy
 class inspection_default
  inspect sqlnet
If you see either of these, then raise the red flag.
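If you have a pile of saved firewall configs to get through, a quick script can do the looking for you. This is only a rough sketch: it assumes you have each config as plain text, the function name find_sqlnet_inspection is made up for illustration, and the patterns match nothing more than the two snippets shown above.

```python
import re

# Patterns based only on the two config snippets shown in this post.
# Anchoring at the start of the line means a disabled entry such as
# "no fixup protocol sqlnet 1521" is not reported.
PIX6_FIXUP = re.compile(r"^\s*fixup\s+protocol\s+sqlnet\b", re.IGNORECASE)
INSPECT_SQLNET = re.compile(r"^\s*inspect\s+sqlnet\b", re.IGNORECASE)


def find_sqlnet_inspection(config_text):
    """Return (line_number, line) pairs where SQL*Net inspection looks enabled."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if PIX6_FIXUP.match(line) or INSPECT_SQLNET.match(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical sample config, not taken from a real device.
    sample = """\
policy-map global_policy
 class inspection_default
  inspect ftp
  inspect sqlnet
"""
    for number, line in find_sqlnet_inspection(sample):
        print(f"line {number}: {line}")
```

An empty result only means these two patterns were not found in the text you fed it. It does not prove the firewall is leaving your traffic alone.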
Now, you could disable the Inspection, but this is usually a global change that affects all the SQL*Net traffic passing through the firewall. So the simple answer:
Always run Oracle Data Guard on a different TCP port
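Once the port has been changed, it is worth confirming that the connect descriptor Data Guard uses really has moved off 1521. Here is a minimal sketch, assuming you can paste the relevant TNS entry in as text. The alias STANDBY_DG, the host name and port 1522 are all made up for the example, as are the function names.

```python
import re

# Matches "(PORT = 1522)" style entries in a TNS connect descriptor.
PORT_RE = re.compile(r"\(\s*PORT\s*=\s*(\d+)\s*\)", re.IGNORECASE)


def ports_in_descriptor(descriptor):
    """Return every PORT value found in the descriptor, in order."""
    return [int(port) for port in PORT_RE.findall(descriptor)]


def uses_default_port(descriptor, default_port=1521):
    """True if any address in the descriptor still points at the default port."""
    return default_port in ports_in_descriptor(descriptor)


if __name__ == "__main__":
    # Hypothetical entry for the Data Guard transport alias.
    entry = """
STANDBY_DG =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby.example.com)(PORT = 1522))
    (CONNECT_DATA = (SERVICE_NAME = standby))
  )
"""
    if uses_default_port(entry):
        print("Still on 1521, so the firewall will still inspect it")
    else:
        print("Ports in use:", ports_in_descriptor(entry))
```

This only checks the text of the descriptor. The listener on the far end still has to be listening on the new port, and the firewall rules still have to allow it.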
Life should then be good again.
Sometimes I really hate the Cisco defaults…