Escalating Events based on Business Criticality

IBM Tivoli Netcool/Impact provides a common platform for data access that circumvents organizational boundaries. It enhances OMNIbus solutions by allowing data from virtually any source to be used to correlate, calculate, enrich, deliver, notify, escalate, visualize and perform a wide range of automated actions.

This technical article demonstrates how information held in an external database (MySQL or DB2) can be used to automatically escalate an alert based on its relative criticality to the business.

The EIF probe and postzmsg commands are used to generate test alerts.

Configuration Data

The data used to determine the severity of an incoming alert is held in an external database. Instructions for creating the database and sample data in either MySQL or DB2 are given below:

MySQL

mysql> create database cmdb;
mysql> use cmdb;
mysql> create table Device ( Hostname VARCHAR(255), Facility VARCHAR(255) );
mysql> create table Department ( DeptName VARCHAR(255), Location VARCHAR(255) );

mysql> insert into Device values ( 'server1','Crewe' );
mysql> insert into Device values ( 'server2','Nantwich' );
mysql> insert into Device values ( 'server3','Winsford' );
mysql> insert into Device values ( 'server4','Winsford' );
mysql> insert into Device values ( 'server5','Crewe' );
mysql> insert into Device values ( 'server6','Northwich' );

mysql> insert into Department values ( 'Engineering', 'Nantwich' );
mysql> insert into Department values ( 'HR', 'Crewe' );
mysql> insert into Department values ( 'Operations', 'Winsford' );
mysql> insert into Department values ( 'Catering', 'Northwich' );
mysql> insert into Department values ( 'Facilities', 'Crewe' );

DB2

db2 => create database cmdb
db2 => connect to cmdb
db2 => create table Device ( Hostname VARCHAR(255), Facility VARCHAR(255) )
db2 => create table Department ( DeptName VARCHAR(255), Location VARCHAR(255) )

db2 => insert into Device values ( 'server1','Crewe' )
db2 => insert into Device values ( 'server2','Nantwich' )
db2 => insert into Device values ( 'server3','Winsford' )
db2 => insert into Device values ( 'server4','Winsford' )
db2 => insert into Device values ( 'server5','Crewe' )
db2 => insert into Device values ( 'server6','Northwich' )

db2 => insert into Department values ( 'Engineering', 'Nantwich' )
db2 => insert into Department values ( 'HR', 'Crewe' )
db2 => insert into Department values ( 'Operations', 'Winsford' )
db2 => insert into Department values ( 'Catering', 'Northwich' )
db2 => insert into Department values ( 'Facilities', 'Crewe' )
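
Before wiring these tables into Impact, it can help to sanity-check the sample data. The following is a minimal sketch and not part of the original setup: it assumes Python with its built-in sqlite3 module as a stand-in for MySQL or DB2, loads the same rows, and joins Device.Facility to Department.Location to preview which departments each server would impact. The function names load_cmdb and impacted_departments are hypothetical.

```python
import sqlite3

# Same sample rows as the MySQL/DB2 statements above.
DEVICES = [
    ("server1", "Crewe"), ("server2", "Nantwich"), ("server3", "Winsford"),
    ("server4", "Winsford"), ("server5", "Crewe"), ("server6", "Northwich"),
]
DEPARTMENTS = [
    ("Engineering", "Nantwich"), ("HR", "Crewe"), ("Operations", "Winsford"),
    ("Catering", "Northwich"), ("Facilities", "Crewe"),
]


def load_cmdb():
    """Build an in-memory copy of the sample cmdb tables (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Device ( Hostname VARCHAR(255), Facility VARCHAR(255) )")
    conn.execute("create table Department ( DeptName VARCHAR(255), Location VARCHAR(255) )")
    conn.executemany("insert into Device values (?, ?)", DEVICES)
    conn.executemany("insert into Department values (?, ?)", DEPARTMENTS)
    return conn


def impacted_departments(conn, hostname):
    """Departments whose Location equals the Facility of the given host."""
    rows = conn.execute(
        "select d.DeptName from Device v join Department d "
        "on d.Location = v.Facility where v.Hostname = ? order by d.DeptName",
        (hostname,),
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = load_cmdb()
    for host, _facility in DEVICES:
        print(host, impacted_departments(conn, host))
```

With this data, only server3 and server4 share a location with the Operations department, so those are the only hosts whose alerts should be escalated later on.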

Build the solution

Create a new project

– Log into Impact as the admin user or another user with suitable permissions
– If necessary, select the NCI server instance (assuming the default names have been used)
– Select the Projects tab
– Click the New Projects + icon
– Enter a project name (orbEventEnrichment)
– Click OK

Create the Event Source

– Make sure the orbEventEnrichment project is selected in the Projects drop down box
– Drop down the Data Sources And Types menu
– Select ObjectServer
– Click the + icon
– Enter the Data Source Name as NCOMS
– Enter the Username as root
– Disable Backup
– Enter the hostname where the ObjectServer resides for the Primary Source
– Click the Test Connection button to make sure everything is OK
– Click OK

Create the Data Source to access the cmdb database (MySQL)

– Drop down the Data Sources And Types menu
– Select MySQL
– Click the + icon
– Enter the Data Source Name as CMDB
– Enter an appropriate username and password
– Disable Backup
– Enter the Host Name, Port and the Database as cmdb
– Click the Test Connection button to make sure everything is OK
– Click OK

Create the Data Source to access the cmdb database (DB2)

– Drop down the Data Sources And Types menu
– Select DB2
– Click the + icon
– Enter the Data Source Name as CMDB
– Enter an appropriate username and password
– Disable Backup
– Enter the Host Name, Port and Database as cmdb
– Click the Test Connection button to make sure everything is OK
– Click OK

Create the Device Data Type (MySQL)

– Drop down the Data Sources And Types menu
– Click the + icon next to CMDB
– Enter Devices as the Data Type Name
– Make sure CMDB is selected as the Data Source Name
– Make sure the Enabled checkbox is selected
– In the Table Description section, select cmdb from the Base Label drop down box
– Select Device from the drop down box next to it
– Click Refresh; this should bring back the table fields
– Make Hostname the key field
– Select Hostname as the Display Name Field
– Click the Save icon (floppy disk) and then close the tab

You should end up with something like this:

[Screenshot: impact1]

Create the Device Data Type (DB2)

– Drop down the Data Sources And Types menu
– Click the + icon next to CMDB
– Enter Devices as the Data Type Name
– Make sure CMDB is selected as the Data Source Name
– Make sure the Enabled checkbox is selected
– In the Table Description section, enter Device into the Base Label text box
– Click Refresh; this should bring back the table fields
– Make Hostname the key field
– Select Hostname as the Display Name Field
– Click the Save icon (floppy disk) and then close the tab

You should end up with something like this:

[Screenshot: impact2]

Create the Department Data Type (MySQL)

– Drop down the Data Sources And Types menu
– Click the + icon next to CMDB
– Enter Department as the Data Type Name
– Make sure CMDB is selected as the Data Source Name
– Make sure the Enabled checkbox is selected
– In the Table Description section, select cmdb from the Base Label drop down box
– Select Department from the drop down box next to it
– Click Refresh; this should bring back the table fields
– Make DeptName the key field
– Select DeptName as the Display Name Field
– Click the Save icon (floppy disk) and then close the tab

Create the Department Data Type (DB2)

– Drop down the Data Sources And Types menu
– Click the + icon next to CMDB
– Enter Department as the Data Type Name
– Make sure CMDB is selected as the Data Source Name
– Make sure the Enabled checkbox is selected
– In the Table Description section, enter Department into the Base Label text box
– Click Refresh; this should bring back the table fields
– Make DeptName the key field
– Select DeptName as the Display Name Field
– Click the Save icon (floppy disk) and then close the tab

Create a Dynamic Link

This will establish a relationship between the Device and Department information. In this example, the Device Facility and the Department Location are related: if a device at a facility fails, any department at the same location is impacted.

– Drop down the Data Sources And Types menu
– Click the Devices Data Type (the actual word Devices)
– Select the Dynamic Links tab
– Click the New Link by Filter icon
– Select Department as the Target Data Type
– Enter a filter of Location = '%Facility%'
– Click OK
– Click the Save icon and then close the tab
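
The link filter Location = '%Facility%' uses the Facility value of the source Devices item to select Department rows. The sketch below is a simplified Python model of that substitution, written as an illustration only: Impact evaluates the real filter against the database itself, and the helper names expand_link_filter and follow_link are hypothetical.

```python
import re


def expand_link_filter(template, source_item):
    """Replace each %Field% token with that field's value from source_item."""
    def substitute(match):
        return str(source_item[match.group(1)])
    return re.sub(r"%(\w+)%", substitute, template)


def follow_link(template, source_item, targets):
    """Return the target rows matched by a simple "Field = 'value'" filter."""
    expanded = expand_link_filter(template, source_item)
    parsed = re.fullmatch(r"\s*(\w+)\s*=\s*'([^']*)'\s*", expanded)
    if parsed is None:
        raise ValueError("only simple equality filters are modelled: " + expanded)
    field, value = parsed.groups()
    return [row for row in targets if row[field] == value]


if __name__ == "__main__":
    device = {"Hostname": "server3", "Facility": "Winsford"}
    departments = [
        {"DeptName": "Engineering", "Location": "Nantwich"},
        {"DeptName": "HR", "Location": "Crewe"},
        {"DeptName": "Operations", "Location": "Winsford"},
    ]
    print(expand_link_filter("Location = '%Facility%'", device))
    print(follow_link("Location = '%Facility%'", device, departments))
```

For server3 the expanded filter reads Location = 'Winsford', which selects the Operations row from the sample data.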

Test the Link

– Drop down the Data Sources And Types menu
– Click the View Data Items icon for Devices
– Click the view linked data icon (magnifying glass) for one of the servers
– This should bring up a window showing the associated Departments

[Screenshot: impact3]

Create the Policy

For this example, the Operations department is considered to be critical to the business. If any devices located at the same facility fail, we want to automatically increase the severity of the associated alert.

To do so, the policy must first determine the facility of the device by querying the Devices data type. Then it must find the departments at that location, this time by using the dynamic link.

Finally, each department at the location is checked. If the Operations department is impacted, the severity of the alert is increased.
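
The three steps described above can be modelled outside Impact to check the expected outcome before writing the policy. This is a hedged Python sketch of the decision logic only, not the policy itself: the dictionaries stand in for the Devices and Department data types, the name enrich is hypothetical, and the starting Severity of 2 is an arbitrary placeholder.

```python
# Stand-ins for the Devices and Department data types (sample data from above).
DEVICES = {
    "server1": "Crewe", "server2": "Nantwich", "server3": "Winsford",
    "server4": "Winsford", "server5": "Crewe", "server6": "Northwich",
}
DEPARTMENTS = [
    ("Engineering", "Nantwich"), ("HR", "Crewe"), ("Operations", "Winsford"),
    ("Catering", "Northwich"), ("Facilities", "Crewe"),
]
CRITICAL_DEPARTMENT = "Operations"
CRITICAL_SEVERITY = 5


def enrich(event):
    """Return a copy of the event, escalated if Operations is impacted."""
    result = dict(event)
    # Step 1: look up the facility of the device named in the Node field.
    facility = DEVICES.get(event["Node"])
    if facility is None:
        return result  # unknown device: leave the event unchanged
    # Step 2: find the departments at that location (the dynamic link).
    impacted = [name for name, location in DEPARTMENTS if location == facility]
    # Step 3: escalate when the critical department is among them.
    if CRITICAL_DEPARTMENT in impacted:
        result["Severity"] = CRITICAL_SEVERITY
    return result


if __name__ == "__main__":
    for node in ("server1", "server2", "server3"):
        print(enrich({"Node": node, "Severity": 2}))
```

Only the event for server3 comes back with Severity 5 here; the events for server1 and server2 keep their original severity.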

– Drop down the Policies menu
– Select the Custom template
– Click the + icon
– Name the policy orbEventEnrichment and click Save

Writing code within the browser editor can be a bit awkward, especially when it comes to indentation. You may find it easier to use another editor, such as vi, and then copy and paste the code in. To help with indentation I would recommend expanding tabs, which can be done in vi with the following settings (add them to your .exrc file to make them permanent):

set expandtab
set shiftwidth=4
set softtabstop=4
set tabstop=4

The policy code:

/*
    Policy: orbEventEnrichment
    Author: Ant Mico
    Date  : February 2009
    Desc  : Sample policy demonstrating some key Impact policy
            functionality. Loosely based on the example given
            in the Solution Guide (with the errors removed!).
*/
// Set up some variables that are used by the logging function
policyName = "orbEventEnrichment";
debugLevel = 1;
// Log a start up message
// Note the use of a library policy which contains regularly used functions
// This is a normal policy, just need to fully qualify the function to access it
orbFunctionLibrary.orbLogger(debugLevel, policyName, "START");
// Query the Devices DataType
// We assume that the Node field in the Omnibus alert correlates with the Hostname
dataType = "Devices";
filter = "Hostname = '" + @Node + "'";
countOnly = False;
// GetByFilter will return an array with the matching Data Items
devices = GetByFilter(dataType, filter, countOnly);
// The Length() function returns the number of elements in the array
If ( Length(devices) < 1 )
{
    orbFunctionLibrary.orbLogger(debugLevel, policyName, "No devices found.");
}
Else
{
    index = 0;
    While ( index < Length(devices) )
    {
        msg =  "Device " + devices[index].Hostname + " is in the " + devices[index].Facility + " facility.";
        orbFunctionLibrary.orbLogger(debugLevel, policyName, msg);
        index = index + 1;
    }
    // Now we can use the link to get the impacted Departments
    // Create an array in which the target DataType is stored
    dataTypes = { "Department" };
    // Set the filter and maximum rows to return
    filter = NULL;
    maxToReturn = 10000;
    departments = GetByLinks(dataTypes, filter, maxToReturn, devices);
    If ( Length(departments) < 1 )
    {
        orbFunctionLibrary.orbLogger(debugLevel, policyName, "No departments found.");
    }
    Else
    {
        index = 0;
        While ( index < Length(departments) )
        {
            // Store the array element in a separate variable to
            // make accessing it easier
            dept = departments[index];
            msg = "Department " + dept.DeptName + " is impacted.";
            orbFunctionLibrary.orbLogger(debugLevel, policyName, msg);
            // Check to see if it is the Operations department
            If ( dept.DeptName == "Operations" )
            {
                // It is, so set the event Severity field to 5 (Critical)
                // Note that the @ syntax is shorthand for the EventContainer
                // variable which contains the event under consideration.
                // The other way to access it would be EventContainer.Severity
                @Severity = 5;
                // This next bit is important.  Once the event has been
                // changed we must return it so that it gets updated in
                // Omnibus.  Note the use of the EventContainer variable.
                ReturnEvent(EventContainer);
            }
            index = index + 1;
        }
    }
}
orbFunctionLibrary.orbLogger(debugLevel, policyName, "FINISH");

Note the use of the orbLogger function, which resides in a policy called orbFunctionLibrary. This is just another normal policy but it demonstrates how functions can be referenced across policies. This is very useful for grouping regularly used bits of code.

The orbFunctionLibrary policy:

/*
    Policy: orbFunctionLibrary
    Author: Ant Mico
    Date  : February 2009
    Desc  : Group of functions that may be useful to other policies
*/
/*
    Function to implement standard policy logging.
    Takes three arguments:
        - debugLevel, an Integer 0 (off) or 1 (on)
        - policyName, a String containing the name of the policy
        - message, a String containing the message to log
    debugLevel could be extended to include more verbose
    information like CurrentContext() etc.
*/
Function orbLogger(debugLevel, policyName, message)
{
    debugLevel = Int(debugLevel);
    If ( debugLevel > 0 )
    {
        Log(LocalTime(getDate()) + " " + policyName + " : " + message);
    }
}

Create the EventReader service

This is the service that will read events from the ObjectServer.

– Drop down the Services menu
– Select OmnibusEventReader from the menu
– Click the + icon
– Enter the service name as orbOmnibusEventReader
– Change the Data Source to be NCOMS
– Select the Event Mapping tab
– Click the New Mapping button
– Enter the Filter Expression Node LIKE '^server[0-9]+$' so that only alerts from servers that match this filter will trigger the policy
– Select the orbEventEnrichment policy as the policy to run
– Check the Active checkbox
– Click OK
– Click OK
– The service should appear in the Service Status window at the bottom left of the page, as shown below:

[Screenshot: impact4]
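
To see which node names the event mapping filter is meant to select, the pattern ^server[0-9]+$ can be tried out in Python. This is an approximation only: Python's re module is not the ObjectServer's regular expression engine, and the helper name node_matches is hypothetical. The ^ and $ anchors are modelled here with fullmatch.

```python
import re

# Body of the pattern from the filter; fullmatch supplies the ^ and $ anchoring.
NODE_PATTERN = re.compile(r"server[0-9]+")


def node_matches(node):
    """True if the whole node name is 'server' followed by one or more digits."""
    return NODE_PATTERN.fullmatch(node) is not None


if __name__ == "__main__":
    for name in ("server1", "server42", "server", "webserver1", "server1a"):
        print(name, node_matches(name))
```

Names such as webserver1 or server1a are rejected because the pattern is anchored at both ends, so alerts from those nodes would not trigger the policy.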

Start the orbOmnibusEventReader service

– Click the start button
– Click the log button to see what it is doing
– You should see the EventReader periodically querying the ObjectServer for alerts

Generate test events

Using postzmsg or an equivalent, generate some test alerts:

postzmsg -f $OMNIHOME/bin/tec.cfg -r WARNING -m "Hardware failure detected" hostname=server1 origin=server1 Device_Down TEC

postzmsg -f $OMNIHOME/bin/tec.cfg -r WARNING -m "Hardware failure detected" hostname=server2 origin=server2 Device_Down TEC2

postzmsg -f $OMNIHOME/bin/tec.cfg -r WARNING -m "Hardware failure detected" hostname=server3 origin=server3 Device_Down TEC3

You should see the alert from server3 escalated to Critical severity, because server3 is in the Winsford facility, the same location as the Operations department.

The log for the PolicyLogger service should contain the following types of entries:

Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Device server1 is in the Crewe facility.
Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Department HR is impacted.
Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Department Facilities is impacted.
Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : FINISH
Parser log: 2009-02-19 17:10:57.000 orbEventEnrichment : START
Parser log: 2009-02-19 17:10:57.000 orbEventEnrichment : Device server2 is in the Nantwich facility.
Parser log: 2009-02-19 17:10:57.000 orbEventEnrichment : Department Engineering is impacted.
Parser log: 2009-02-19 17:10:57.000 orbEventEnrichment : FINISH
Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : START
Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : Device server3 is in the Winsford facility.
Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : Department Operations is impacted.
MP.returnEvent did eri.putEvent for EventContainer: (OwnerUID=65534, Class=6601, Service=, Serial=430, RemoteSecObj=, TECFQHostname=, LocalNodeAlias=, TaskList=0, TECEventHandle=, PhysicalPort=0, NmosEntityId=0, LocalPriObj=, NmosObjInst=0, TECDate=, LocalRootObj=, EventId=, Flash=0, ProcessReq=0, TECHostname=, RemoteRootObj=, ExpireTime=0, SuppressEscl=0, ReceivedWhileImpactDown=0, InternalLast=1235063480, Grade=1, TECStatus=, Node=server3, RemoteNodeAlias=, RemotePriObj=, TECServerHandle=, Severity=5, ExtendedAttr=, StateChange=1235063480, KeyField=430, Acknowledged=0, NmosManagedStatus=0, FirstOccurrence=1235063480, ServerName=NCOMS_A, URL=, Poll=0, PhysicalCard=, NmosSerial=, Identifier=:TEC3:Device_Down, OwnerGID=0, LastOccurrence=1235063480, X733ProbableCause=0, Agent=TEC3, AlertGroup=Device_Down, PhysicalSlot=0, NmosDomainName=, NmosCauseType=0, Summary=Hardware failure detected, Tally=1, TECRepeatCount=0, NodeAlias=server3, Location=, Type=1, LocalSecObj=, X733SpecificProb=, Manager=tivoli_eif probe on carl, X733EventType=0, Customer=, AlertKey=TEC3, EventReaderName=orbOmnibusEventReader, ServerSerial=430, X733CorrNotif=, TECDateReception=)
Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : FINISH
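
When there are many test events, reading the log by eye becomes tedious. The following Python sketch groups the "Department ... is impacted" lines under the preceding "Device ..." line. It is an illustration that assumes the message formats produced by the policy above and assumes that log lines from different events are not interleaved; the name summarise is hypothetical.

```python
import re

DEVICE_RE = re.compile(r"Device (\S+) is in the (\S+) facility\.")
DEPT_RE = re.compile(r"Department (\S+) is impacted\.")

# A few lines copied from the log excerpt above.
SAMPLE = [
    "Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Device server1 is in the Crewe facility.",
    "Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Department HR is impacted.",
    "Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Department Facilities is impacted.",
    "Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : FINISH",
    "Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : START",
    "Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : Device server3 is in the Winsford facility.",
    "Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : Department Operations is impacted.",
    "Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : FINISH",
]


def summarise(log_lines):
    """Map each device seen in the log to the departments logged after it."""
    summary = {}
    current = None
    for line in log_lines:
        device = DEVICE_RE.search(line)
        if device:
            current = device.group(1)
            summary.setdefault(current, [])
            continue
        dept = DEPT_RE.search(line)
        if dept and current is not None:
            summary[current].append(dept.group(1))
        elif line.rstrip().endswith(": FINISH"):
            current = None  # end of this policy run
    return summary


if __name__ == "__main__":
    for host, departments in summarise(SAMPLE).items():
        print(host, departments)
```

Any device whose list includes Operations should also show Severity=5 in the ObjectServer.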