Monitoring Load Balanced Systems with ITM 6

When systems sit behind a load balancer, you may want to use ITM logical views to show whether a particular service is running. For example, in the view below we want to know whether the Web Servers service is up (green) or down (red).

service view

Specifically, a Critical event on one load balanced system need not make the application itself unavailable, because the other load balanced system will still be working. For example, suppose we want to monitor our Web Servers service, which runs on 2 servers called scooby and fibademo. These systems will be monitored for standard things like IIS, the Operating System and DB2. Some of the resulting Critical events might bring the application down, but a single event on just one of these systems will not cause the failure.

Situation Comparisons

To do this I will create a single Situation Comparison called WebServers.

sit1

Click on the Situation Comparison box and choose all the situations you deem critical to the application running. For this example I have chosen:

  1. MS_Offline
  2. UDB_Status_Warning
  3. NT_Paging_File_Critical

sit2

Next click on the boxes beneath all of the situations you have chosen to compare against and ensure they are set to true.

Ensure that you have set the situation tests to TRUE on separate lines, as setting them all on the same line has a different meaning.

For example, the following selection means this situation should fire if any one of MS_Offline, UDB_Status_Warning, or NT_Paging_File_Critical fires. In other words it is an OR comparison.

or comparison2

Whereas in this example the situation will fire only if MS_Offline, UDB_Status_Warning, and NT_Paging_File_Critical all fire together. In other words it is an AND comparison.

and comparison2

In our case we want to use the first choice (OR) so that it will fire on any situation.
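
The difference between the two layouts can be sketched in a few lines of Python. This is only an illustrative model of the formula grid, not an ITM API: the function name `formula_fires` and the list-of-rows representation are assumptions made for the example. Cells on the same row are treated as ANDed together and separate rows as ORed.

```python
# Hypothetical model of the Situation editor's formula grid (not an ITM API).
# Each inner list is one row: cells on the same row are ANDed, rows are ORed.

def formula_fires(rows, active):
    """Return True if every situation in at least one row is currently true."""
    return any(all(name in active for name in row) for row in rows)

# TRUE set on separate rows: OR comparison
or_rows = [["MS_Offline"], ["UDB_Status_Warning"], ["NT_Paging_File_Critical"]]

# TRUE set on a single row: AND comparison
and_rows = [["MS_Offline", "UDB_Status_Warning", "NT_Paging_File_Critical"]]

active = {"MS_Offline"}  # only one of the three situations has fired

print("OR layout fires: ", formula_fires(or_rows, active))
print("AND layout fires:", formula_fires(and_rows, active))
```

With only MS_Offline true, the OR layout fires and the AND layout does not, which is why WebServers uses separate rows.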

Lastly choose the interval and distribute the new situation to both our servers: scooby and fibademo.

sit3

Correlate Situations Across Managed Systems

Next we need to create a situation that checks whether WebServers has fired on both of our systems; in other words, are both our web servers unavailable? This is done by creating a situation that will Correlate Situations Across Managed Systems and assigning it to our logical view.

To do this we must first assign a managed system to the logical item we are using (Web Servers). Right-click on the logical item, select Properties, and select *HUB from the Available Managed System Lists. Click OK.

properties

Now we can create our situation. Right-click on the logical item again and you should now be able to see Situations. Select this, then at the top of the window click on the icon with the 3 down arrows (Show Situations), select the three options and click OK.

show situations

Enter a name for the new situation and click the Correlate Situations Across Managed Systems button. I have called the situation WebServerStatus.

correlate

Click OK and you will see the following dialog.

correlate2

Look for the situation we created earlier called WebServers and select it. This will show the servers that have had this situation distributed. Choose the ones that form part of the load balanced service. Click OK.

Note that although we have used 2 systems, this tip scales equally well to 3, 4 or even 1000 systems.

You will also see that our new situation has been created against the Tivoli Enterprise Monitoring Server. This is because all correlated situations must be analysed against the server.

correlate3

Also note that on this occasion, when we set the Formula by setting the items to true, we need to do this on the same line. This is because we only want this situation to fire when both situations are true at the same time.

Set the Sampling Interval again but note that you do not need to distribute this situation as it can only run on the Tivoli Enterprise Monitoring Server.
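
As a rough illustration of what the correlated situation checks, the Python sketch below models WebServerStatus as an AND across the managed systems. The names `LOAD_BALANCED_SYSTEMS` and `web_server_status_fires` are hypothetical and exist only for this example; ITM evaluates the real formula on the monitoring server.

```python
# Hypothetical sketch of the correlated situation's logic (not an ITM API).
# WebServerStatus should be true only when WebServers is true on every
# system that forms part of the load balanced service.

LOAD_BALANCED_SYSTEMS = ["scooby", "fibademo"]

def web_server_status_fires(webservers_true_on):
    """webservers_true_on: set of systems where WebServers is currently true."""
    return all(system in webservers_true_on for system in LOAD_BALANCED_SYSTEMS)

print(web_server_status_fires({"fibademo"}))            # one server affected
print(web_server_status_fires({"fibademo", "scooby"}))  # both servers affected
```

Adding more names to the list models a larger pool of load balanced systems; the check itself does not change.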

Click OK and the procedure is finished. All we have to do now is test it.

Testing

To do this I will shut down both agents, one at a time.

First I shut down the fibademo OS agent.

This results in 2 events:

  1. MS_Offline on fibademo
  2. WebServers on fibademo

But the Web Servers icon remains green.

Next I shut down the OS agent on scooby.

This results in 3 events:

  1. MS_Offline on scooby
  2. WebServers on scooby
  3. WebServerStatus on the Tivoli Enterprise Monitoring Server (TEMS)

And checking back at the Service view we now see that the Service Status has been marked as down (red) and all without using a TEC!

service view red
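
The test sequence can also be walked through with a small Python simulation. It is a sketch under stated assumptions: `open_events` and `icon` are hypothetical helpers written for this example, and the event strings simply mirror the ones listed above rather than anything produced by ITM itself.

```python
# Hypothetical simulation of the two test steps (not an ITM API).
CRITICAL = {"MS_Offline", "UDB_Status_Warning", "NT_Paging_File_Critical"}
SYSTEMS = ["scooby", "fibademo"]

def open_events(fired):
    """fired maps a system name to the set of base situations true on it."""
    events = []
    affected = []
    for system in SYSTEMS:
        hits = sorted(fired.get(system, set()) & CRITICAL)
        events += [f"{name} on {system}" for name in hits]
        if hits:  # OR comparison: any critical situation fires WebServers
            events.append(f"WebServers on {system}")
            affected.append(system)
    if len(affected) == len(SYSTEMS):  # AND across all managed systems
        events.append("WebServerStatus on TEMS")
    return events

def icon(fired):
    """Colour of the Web Servers icon in the logical view."""
    return "red" if "WebServerStatus on TEMS" in open_events(fired) else "green"

step1 = {"fibademo": {"MS_Offline"}}                            # first agent stopped
step2 = {"fibademo": {"MS_Offline"}, "scooby": {"MS_Offline"}}  # both agents stopped

for step in (step1, step2):
    print(icon(step), open_events(step))
```

The first step leaves the icon green with two open events; the second adds the events for scooby plus WebServerStatus and turns the icon red.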
