Discovering your devices before moving to the cloud or consolidating data centers

We have been asked by a few companies whether they can use Tivoli Application Dependency Discovery Manager (TADDM) to quickly discover what devices they have in a data center before either moving to the cloud or closing data centers due to consolidation. In addition, Managed Service Providers (MSPs) have requested the ability to discover the devices of a new customer before actively bidding on new work.

The discussions with these potential clients have always hit two stumbling blocks. Firstly, the way TADDM was licensed made a short-term ownership model prohibitively expensive. Secondly, the prerequisites needed to complete a successful scan of an environment (application and operating system credentials and full network access) are not always readily available. In addition, customers would rightly say: if I need to provide credentials, then I also need to know which systems I am providing credentials for, and if I knew that I would not need TADDM.

However, in both these cases the speed of completing the scan is paramount and the depth of the discovery is much less important. This is the starting point for a solution, for the following two reasons:

  1. TADDM can now be rented on a monthly basis, which means that if you want the software to resolve a specific issue, such as moving to the cloud, you can have it for as little as 6 months.
  2. Used in the correct way, a TADDM Level 1 scan discovers basic information about the active computer systems in the run-time environment. This scanning is also known as credential-less discovery because no login information is required to run it. As a consequence, Level 1 discovery is very shallow and only collects the host name, operating system name, IP address, fully qualified domain name, and Media Access Control (MAC) address of each discovered interface.

To get the most out of TADDM we also adopt a staged process that runs Level 1, Level 2 and Level 3 discovery separately. In this way you get immediate value from the product and can also stop using TADDM once you have the data you need, be it a list of devices, configuration items, or applications and dependencies.

Deploying TADDM with a 3 stage process

At a high level this process looks something like this:

TADDM Scan Overview

Level 1 – Device Discovery

To ensure a Level 1 scan phase is successful, TADDM must have access to all of the network you are trying to discover. This means that you first need a list of all your active sub-nets, and you need to install TADDM servers in each location from which those sub-nets can be reached. Generally, in a multi data centre discovery you should add a discovery server in each location as a starting point and then add Windows gateways (for Windows discoveries) and anchors (for traversing firewalls) as you find that parts of the network are not accessible. Typically the gateway and anchor run on the same physical device to save costs, and if the discovery server can access all of the network (unlikely) you could also make it a Windows box to combine all 3 functions and further reduce the equipment needed. The TADDM streaming architecture is highly scalable, so as the environment becomes larger you can add more discovery servers or storage servers (to access the database) as needed, but you can also start out with the minimum configuration (shown in the green box) and revisit this later.
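
As a rough aid to scoping the Level 1 scan, the sketch below totals how many host addresses a list of sub-nets covers and groups them by the discovery server expected to reach them. It is not part of TADDM; the sub-nets and server names are made-up placeholders and should be replaced with your own list.

```python
import ipaddress
from collections import defaultdict

# Hypothetical scope: sub-net -> discovery server expected to reach it.
SCOPE = {
    "10.1.0.0/24": "discovery-london",
    "10.1.1.0/24": "discovery-london",
    "10.2.0.0/22": "discovery-leeds",
}


def addresses_per_server(scope):
    """Return {server: number of usable host addresses it must scan}."""
    totals = defaultdict(int)
    for cidr, server in scope.items():
        # strict=True rejects entries such as 10.1.0.5/24 (host bits set).
        network = ipaddress.ip_network(cidr, strict=True)
        # hosts() skips the network and broadcast addresses for IPv4.
        totals[server] += sum(1 for _ in network.hosts())
    return dict(totals)


if __name__ == "__main__":
    for server, count in sorted(addresses_per_server(SCOPE).items()):
        print(f"{server}: {count} addresses to scan")
```

A count like this gives an early indication of whether one discovery server per location is likely to be enough or whether the load should be split.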

TADDM Architecture

Note that, if required, we can host the storage and database layer in the IBM Cloud. However, it makes little sense to host the Discovery Servers anywhere but the location they are trying to discover.

Once this architecture is installed you can schedule a Level 1 (credential-free) scan. TADDM Level 1 discovery initiates the StackScan Sensor, which uses the IDD technology scanner (created in IBM’s Zurich Labs). Optionally it also uses a tool called Nmap for discovery of non-IBM computer systems. Today the sensor understands z/OS®, z/VM®, OS/400®, AIX®, Solaris™, HP-UX, Linux®, Windows®, Cisco and Alteon; adding Nmap contributes an additional 4,000 signatures, and for this reason it is recommended.

The StackScan Sensor performs the following steps:

  1. It sends a small number of low-level ICMP and TCP/IP packets in order to perform its stack analysis.
  2. For the discovery of active ports, rapidly closing TCP connect attempts are issued according to a configurable list of ports.
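
The second step can be illustrated with an ordinary TCP connect probe. This is not the StackScan Sensor's implementation, only a minimal Python sketch of the idea: try each port in a configurable list and close the connection as soon as it is established. The port list and timeout values are assumptions made for the example.

```python
import socket

# Assumed example list; the real sensor reads its own configurable list.
PORTS_TO_PROBE = [22, 80, 135, 443]


def probe_port(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection performs a full connect; leaving the 'with'
        # block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out or unreachable: treat as not open.
        return False


def open_ports(host, ports=PORTS_TO_PROBE, timeout=1.0):
    """Return the subset of ports that accepted a connection."""
    return [p for p in ports if probe_port(host, p, timeout)]
```

Only run a probe like this against systems you are authorised to scan.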

The responses it gets back are matched to operating systems using the device-matching rules held in a customisable file called fingerprints.conf.

e.g.

# ICMP Echo tests
# Packet format of outgoing ICMP ECHO:
#       TOS=0x2; DF=1; ICMP code=1; 70 bytes of data (i.e. 01234567890123...)

# Linux Example
icmp_echo_test_df(IP_ECN==@IP_ECN:1;IP_FLAGS==0x0;IP_TTL==64;ICMP_CODE==@ICMP_CODE;[OS=Linux;VERSION=])

# TCP SYN with options to open port
# Packet format of outgoing TCP SYN packet:
#       TOS=0x2, DF=1, TCP options=WinScale 10;NOP;MSS 256; Timestamp;EOL

# AIX Example
# Options = MSS, NOP, WinScale, NOP, NOP, Timestamp
tcp_syn_open_port_test(IP_ECN==0x0:1;IP_FLAGS==0x0;IP_ID!=0x0;IP_TTL==64;TCP_WINDOWSIZE>15000;TCP_WINDOWSIZE<17000;TCP_OPTIONS==0x020103010108;[OS=AIX;VERSION=])

# TCP SYN with options to closed port
# Packet format of outgoing TCP SYN packet:
#       ECN=0x2, DF=1, TCP options=WinScale 10;NOP;MSS 256; Timestamp;EOL

# Windows Example
tcp_syn_closed_port_test(IP_ECN==0x0:1;IP_FLAGS==0x0;IP_ID!=0x0;IP_TTL==128;[OS=Windows;VERSION=])
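
To make the structure of these rules easier to follow, here is a small Python sketch that splits the Windows rule above into its test name, its simple field comparisons and its result labels, and then checks the comparisons against a set of observed values. It is an illustration only and not how TADDM itself evaluates fingerprints.conf: the meaning of the `:1` suffix and of `@NAME` references is not described here, so conditions containing them are skipped, and the observed values in the usage note are invented.

```python
import re

# Windows rule copied from the example above.
RULE = ("tcp_syn_closed_port_test(IP_ECN==0x0:1;IP_FLAGS==0x0;"
        "IP_ID!=0x0;IP_TTL==128;[OS=Windows;VERSION=])")

# Only plain "FIELD op number" conditions are understood by this sketch.
CONDITION = re.compile(r"^(\w+)(==|!=|<|>)(0x[0-9a-fA-F]+|\d+)$")
OPS = {
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
}


def parse_rule(rule):
    """Split a rule into (test name, simple conditions, result labels)."""
    test_name, body = rule.rstrip(")").split("(", 1)
    checks, labels = body.split("[", 1)
    conditions = []
    for part in filter(None, checks.split(";")):
        match = CONDITION.match(part)
        # Conditions with a ':N' suffix or an '@NAME' reference do not
        # match the pattern and are skipped (semantics unknown here).
        if match:
            field, op, value = match.groups()
            conditions.append((field, op, int(value, 0)))
    result = dict(item.split("=", 1)
                  for item in labels.rstrip("]").split(";"))
    return test_name, conditions, result


def matches(conditions, observed):
    """True if every parsed condition holds for the observed values."""
    return all(field in observed and OPS[op](observed[field], value)
               for field, op, value in conditions)
```

For example, `matches(parse_rule(RULE)[1], {"IP_FLAGS": 0, "IP_ID": 4242, "IP_TTL": 128})` evaluates the three simple conditions of the Windows rule against a hypothetical reply.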

A SYN scan is relatively unobtrusive and can scan thousands of ports per second. This technique is often referred to as half-open scanning, because it does not open a full TCP connection. A SYN packet is sent as if opening a real connection: a SYN/ACK response indicates the port is listening (open), while a RST (reset) indicates a non-listener. If no response is received after several retransmissions, the port is marked as filtered. The port is also marked filtered if an ICMP unreachable error (type 3, code 0, 1, 2, 3, 9, 10, or 13) is received. SYN scan works against any compliant TCP stack.
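
The decision logic in this paragraph can be written as a small function. The sketch below does not send any packets; it only maps a probe outcome to a port state, and the outcome labels ("SYN/ACK", "RST", "ICMP", or None for no reply) are names chosen for this example.

```python
# ICMP type 3 (destination unreachable) codes that mark a port as filtered.
FILTERED_ICMP_CODES = {0, 1, 2, 3, 9, 10, 13}


def classify_port(response, icmp_type=None, icmp_code=None):
    """Map a SYN probe outcome to a port state.

    response is "SYN/ACK", "RST", "ICMP", or None when nothing came back
    after several retransmissions. These labels are illustrative only.
    """
    if response == "SYN/ACK":
        return "open"        # something is listening
    if response == "RST":
        return "closed"      # host answered, but no listener
    if response is None:
        return "filtered"    # silence after retransmissions
    if (response == "ICMP" and icmp_type == 3
            and icmp_code in FILTERED_ICMP_CODES):
        return "filtered"    # an intermediate device rejected the probe
    return "unknown"
```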

For each system it discovers using this method, TADDM assigns a confidence level to indicate how certain it is of the accuracy of its discovery. If the operating system confidence level is below the threshold, the operating system is modeled as a general computer system. The threshold can be any value between 0 and 100 and is set with the sensor configuration attribute confidenceThreshold in StackScanSensor.xml.

<confidenceThreshold>35</confidenceThreshold>
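
The effect of the threshold can be shown with a few lines of Python. This is a sketch of the rule as described here, not TADDM code: the function name and the "generic computer system" label are invented for the example, and a confidence equal to the threshold is kept because only values below it are described as falling back.

```python
CONFIDENCE_THRESHOLD = 35  # value from the example entry above


def modelled_type(detected_os, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Return the OS to model, or a generic label when confidence is low."""
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    if confidence < threshold:
        # Below the threshold: do not trust the operating system guess.
        return "generic computer system"
    return detected_os
```

With the threshold at 35, a Linux match at 75% confidence is kept as Linux, while the same match at 20% would be modeled generically.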

The image below shows a system that has been discovered at a 75% confidence level and classified as a Linux system.

Level1-Screenshot

The next stage is to run a report to view which sub-nets have been scanned (i.e. sub-nets with 1 or more discovered devices) and from which discovery server. In larger configurations a sub-net may only be reachable from 1 discovery server, and you need to know which one for future scans. At this stage you will also find out which sub-nets have returned 0 devices from all discovery servers. These sub-nets need to be assessed to see whether they truly have 0 devices or whether they are inaccessible. If it is the latter, you will need to either create a new route or add a new anchor. Once the issue is fixed, repeat the scanning process until all sub-nets are found. You can then report to the business with some degree of certainty how many devices you have, what their IP addresses are and which OS they are running (with the associated confidence level).
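
The report described here can be approximated outside TADDM once the Level 1 results have been exported. The sketch below assumes the export can be reduced to (IP address, discovery server) pairs, which is an assumption about your export format rather than a TADDM feature; the function names are invented for the example.

```python
import ipaddress
from collections import defaultdict


def subnet_report(subnets, discoveries):
    """Summarise Level 1 results per sub-net.

    subnets: iterable of CIDR strings that were in scope.
    discoveries: iterable of (ip_address, discovery_server) pairs.
    Returns {cidr: {"devices": int, "servers": sorted list of servers}}.
    """
    networks = {cidr: ipaddress.ip_network(cidr) for cidr in subnets}
    seen = {cidr: set() for cidr in networks}
    servers = defaultdict(set)
    for ip, server in discoveries:
        address = ipaddress.ip_address(ip)
        for cidr, network in networks.items():
            if address in network:
                # A set avoids double counting a device that was seen
                # from more than one discovery server.
                seen[cidr].add(address)
                servers[cidr].add(server)
    return {cidr: {"devices": len(seen[cidr]),
                   "servers": sorted(servers[cidr])}
            for cidr in networks}


def empty_subnets(report):
    """Sub-nets with 0 devices: genuinely empty, or not reachable."""
    return sorted(cidr for cidr, row in report.items()
                  if row["devices"] == 0)
```

The sub-nets returned by `empty_subnets` are the ones to investigate for missing routes or anchors before the next scan.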

Level 2 – Configuration Item Discovery

Level 1 may be enough for your needs, but assuming you need more detail you can now use the Level 1 report to start collecting credentials for the operating systems or network devices you have discovered. These may be logins, SNMP credentials or SSH keys, but without them you will not be able to do a Level 2 discovery. An overview of the more basic requirements is shown below.

Operating System Requirements

Windows: A service account is required that is a member of the local Administrators group. This account can be a local account or a domain account. Because TADDM relies on WMI for discovery, the WMI service must be installed and enabled on each target and the service account must have access to all WMI objects on the local machine.

Solaris, Linux, AIX and HP-UX: The service account must allow access to all resources on the target computer system that TADDM must discover. The service account must have write access privileges to its home directory on each target computer system. This directory requires approximately 20 MB of free space. During a discovery, scripts and temporary result files can be stored in this directory. After the discovery is run, the files are deleted.

To provide complete information on application configuration and dependencies on UNIX and Linux hosts, TADDM requires the List Open Files (lsof) program to be installed on all target computer systems.
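
Before handing a UNIX or Linux service account over for Level 2 discovery, the requirements above can be checked on a target with a short script. The sketch below is not supplied with TADDM and assumes Python 3 is available on the target, which may not be the case; it is meant to be run as the service account itself.

```python
import os
import shutil

# About 20 MB, per the home directory requirement above.
REQUIRED_FREE_BYTES = 20 * 1024 * 1024


def check_unix_prerequisites(home_dir=None):
    """Check the home directory and lsof availability for the current user."""
    home_dir = home_dir or os.path.expanduser("~")
    free_bytes = shutil.disk_usage(home_dir).free
    return {
        # Scripts and temporary result files are written here.
        "home_writable": os.access(home_dir, os.W_OK),
        "enough_space": free_bytes >= REQUIRED_FREE_BYTES,
        # lsof must be on the PATH of the service account.
        "lsof_found": shutil.which("lsof") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_unix_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```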

SNMP Credentials: TADDM uses SNMP to discover network device configurations. Depending on your SNMP version, the following information is required to facilitate SNMP discovery:

  • SNMP MIB2 GET Community String, with permission for MIB2 System, IP, Interfaces and Extended Interfaces. Vendor Private sysOID or sysDESCR (e.g. Cisco, Extreme etc.)
  • SNMP v3 credentials including user name, password and authentication protocol

Inevitably you will not get all the credentials correct in one go, so this will be an iterative process. Personally I prefer to use a new, dedicated account for both UNIX/Linux and Windows, as this allows you to check quickly whether the credentials have been created or not; many people use SSH keys instead of a login for UNIX/Linux. Once you have all the correct credentials you will start to gather some valuable information about the systems, including CPU, memory, paging space, file system layout, installed packages, running services and network card information. An example screenshot of a system is shown below.

Level2-Screenshot

Level 3 – Application and Dependency Discovery

TADDM Level 3 scanning discovers detailed information about the application infrastructure, deployed software components, physical servers, network devices, virtual systems, and host data that are used in the run-time environment. This scanning is also known as credentialed discovery, and it requires both operating system credentials and application credentials.

Assuming you have successfully run a Level 2 scan, it is a worthwhile exercise to run a Level 3 scan before collecting the application credentials, as this will initiate the application sensors based on the ports and processes discovered as part of the Level 2 scan. The list of launched sensors will enable you to define the application credentials in TADDM for all the applications that have been identified. Once these have been collected, which may take a while (depending on how many applications and application owners there are), you can add the new credentials and configure the sensors as appropriate.

You may also find that some applications do not have standard sensors available; in this case you can create custom templates to ensure that everything is discovered. You can also choose to capture configuration files so that changes in your own applications are captured in the same way as with the IBM-supplied sensors.

Again, scanning at Level 3 will be an iterative process and will take many attempts to get the correct credentials in place. However, when this is successfully completed you can start to view your business applications.

In the diagram below you can see a Level 3 quick discovery of the TADDM infrastructure and the discovered connections at the time the discovery was run.

topology

License Rental option for short term discovery

From a pricing point of view, if a business wanted to implement a discovery solution using TADDM for a short-term deliverable (e.g. a data center migration inventory), the perpetual pricing route could be prohibitive, as IBM charges for a single install license plus a price per device. Depending on the number of devices you have, the install license was likely to make up the majority of the software purchase price.

However, if the project is planned to deliver within 6 to 12 months (in fact anything less than 24), then the rental option is considerably cheaper. The base cost is still there but it is now a smaller monthly charge, and because you only pay for devices as and when you use them, you do not start paying for all your devices from day one.

There is still a charge for our service to get the software deployed and running, but this would be there anyway, and because we have many years of project experience we will be able to deploy the project in the shortest time possible (notwithstanding that the prerequisites MUST be met).

If you are interested in a quote or a quick discussion on how TADDM works then please contact me at simon.barnes@orb-data.com.
