We have two boxes, one at each of two sites, set aside for Grid Control.
At the moment we are running one as live, monitoring all the targets. The DR box is also up and running, and in a DR situation we will reconfigure our 10g agents to point to the DR Grid box. The DR box monitors the live box and vice versa.
This poses some difficulties: any jobs, historical data and so on will be lost in the event of DR.
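For what it's worth, the agent repointing step could be scripted rather than done by hand. This is only a rough sketch: it assumes the agent keeps its upload URL in a properties file under a key called REPOSITORY_URL (check your own agent config, the key name is an assumption here), and the host names are made up. It only rewrites the text; the agent would presumably still need restarting, and possibly re-securing, afterwards.

```python
import re

# Assumption: the agent's upload URL lives on a line such as
#   REPOSITORY_URL=http://gridlive.example.com:4889/em/upload/
# Both the key name and the host names are illustrative.
_URL_LINE = re.compile(r"^(REPOSITORY_URL\s*=\s*https?://)([^:/\s]+)", re.MULTILINE)


def repoint_agent(properties_text, new_host):
    """Return the properties text with the REPOSITORY_URL host replaced.

    Port, path and every other line are left untouched.
    """
    # A function replacement avoids any backslash surprises in new_host.
    return _URL_LINE.sub(lambda m: m.group(1) + new_host, properties_text)
```

You would read the file, pass its contents through repoint_agent(text, "griddr.example.com"), and write the result back, keeping a copy of the original so you can fail back later.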
As Grid just uses an Oracle database as its repository, we are also considering using a standby database instead. In the event of DR we would activate the database, possibly rename the box, start the Grid components and, fingers crossed, the agents will start talking to the DR box.
My concern is the agents will not fail over cleanly. Also, we have nothing monitoring whether the live box goes down, and would have to write a script to do this.
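The monitoring script I have in mind would be something along these lines. It is a minimal sketch only: it treats the live box as "up" if a TCP connection to a given port succeeds, and only declares it down after several consecutive failures so that a single network blip does not trigger DR. The function names are hypothetical, and the host and port are whatever your own setup uses.

```python
import socket
import time


def is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def confirmed_down(host, port, attempts=3, interval=60.0, timeout=5.0):
    """Return True only if every one of `attempts` checks fails.

    Sleeps `interval` seconds between checks; returns False as soon as
    one check succeeds.
    """
    for i in range(attempts):
        if is_reachable(host, port, timeout=timeout):
            return False
        if i < attempts - 1:
            time.sleep(interval)
    return True
```

Run from cron on the DR box, a call such as confirmed_down("gridlive.example.com", 4889) could send a page or kick off the failover steps when it returns True. The host name is made up, and 4889 is only my assumption for the upload port; use whatever port your management service actually listens on.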
Does anyone have a similar setup or any advice? The documentation for Grid is a little thin.
Grid Control, soon to be renamed yet again to "10g OEM", has its place, but I'm not entirely happy with it.
I don't trust OEM's job control system. Cron and shell/SQL scripts are much more portable and a snap to recover. If you prefer "next", "next", "finish", be my guest.
Grid's alerts are really slow for critical up/down system notifications. The agent has to post its metrics before Grid Control forwards the alert. I'd recommend SiteScope.
On the other hand, 10g OEM is good for database object and ASM management. However, there's a ton of stuff you can't manage with the HTML version. You'll need to get the Java version to manage Advanced Queues, OLAP, and a bunch of other components.
I haven't had success doing a DR with the repository. I could recover the DB objects, but I had problems reconnecting a different server to the recovered repository. Best of luck with this one.
Thanks for the link, I'll check it out... missed that one.
I've not found the system up/down notifications too bad. They can, however, take up to 5 minutes, so it depends how critical that is for you.
We've found a whole host of bugs and issues with Release 1: the job system isn't up to scratch (it's worse than 9i) and notification formats are not configurable. Oracle won't give a firm date for Release 2 yet either...
The agents, on the other hand, are the most stable yet... and with a bit of frigging you can monitor a RAC setup properly.
Always surprised at how few people use OEM/Grid though.