Network Monitoring News

October 31st, 2007 by James Cerwinski

Categories: Asset Management, IT Admin, Network Monitoring

Improving Incident Management in an SMB


According to ITIL, an IT incident is an event that is not part of standard IT operations and that causes, or may cause, an interruption to, or a reduction in, the quality of service. Your objective is to restore full service as soon as possible after an incident.

The obstacles to meeting this objective fall into three areas:

  • Detection time - You don’t know about an incident until a user complains
  • Diagnosis time - You lack current information about your network
  • Remediation time - You can’t remotely access a node to restore it to service

 

Your solution should include:

Ability to improve detection time by:

  • Polling for incidents via synthetic transactions
  • Event consolidation and monitoring
  • Monitoring performance thresholds
  • Monitoring performance trends
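
As a rough illustration of the detection items above, here is a minimal Python sketch of a synthetic transaction poll, a threshold check, and a simple trend estimate. The function names, the 2-second response limit, and the use of a plain HTTP GET as the "transaction" are assumptions made for this example, not features of any particular monitoring product.

```python
import time
import urllib.request
from statistics import mean


def default_fetch(url, timeout=5.0):
    """Perform one HTTP GET and return the status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def synthetic_check(url, fetch=default_fetch, max_seconds=2.0):
    """Run one synthetic transaction; return (ok, elapsed_seconds, detail)."""
    start = time.monotonic()
    try:
        status = fetch(url)
    except Exception as exc:  # any failure to complete counts as an incident
        return False, time.monotonic() - start, f"error: {exc}"
    elapsed = time.monotonic() - start
    if status != 200:
        return False, elapsed, f"unexpected status {status}"
    if elapsed > max_seconds:  # hypothetical response-time limit
        return False, elapsed, "slow response"
    return True, elapsed, "ok"


def breaches_threshold(samples, limit):
    """True if the most recent sample exceeds the limit."""
    return bool(samples) and samples[-1] > limit


def trend_slope(samples):
    """Least-squares slope of samples against their index (units per poll)."""
    n = len(samples)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(samples)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den
```

Run on a schedule, a check like this reports a failed or slow service without waiting for a user to complain. A positive slope on a metric such as disk usage gives early warning before the threshold itself is crossed.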

Ability to improve diagnosis time by:

  • Notifying you of an incident before your users do
  • Automatically collecting and consolidating, on one page, what has been happening on a given node or service in terms of events and performance
  • Consolidating, on the same page, the current configuration of a node and recent changes to its software and hardware
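
The consolidation idea can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the record shapes (tuples keyed by node name), the `NodeSummary` class, and the `consolidate` function are hypothetical, and a real tool would pull this data from its own event, performance, and inventory stores.

```python
from dataclasses import dataclass, field


@dataclass
class NodeSummary:
    """Everything known about one node, gathered in one place."""
    node: str
    events: list = field(default_factory=list)    # (timestamp, message)
    metrics: dict = field(default_factory=dict)   # metric name -> list of values
    changes: list = field(default_factory=list)   # (timestamp, description)


def consolidate(events, samples, changes):
    """Group records by node.

    events:  iterable of (node, timestamp, message)
    samples: iterable of (node, metric_name, value)
    changes: iterable of (node, timestamp, description)
    """
    summaries = {}

    def summary_for(node):
        return summaries.setdefault(node, NodeSummary(node))

    for node, ts, message in events:
        summary_for(node).events.append((ts, message))
    for node, metric, value in samples:
        summary_for(node).metrics.setdefault(metric, []).append(value)
    for node, ts, description in changes:
        summary_for(node).changes.append((ts, description))

    # Sort by timestamp so recent activity reads in order.
    for summary in summaries.values():
        summary.events.sort()
        summary.changes.sort()
    return summaries
```

Seeing a node's recent events next to its recent changes is what shortens diagnosis: a service restart that follows a software patch by a few minutes is an obvious first lead.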

Ability to improve remediation time by:

  • Anytime, anywhere out-of-band access via KVM over IP
  • “Virtual Media” capability to remotely mount drives to install software and run diagnostic tests

This is a good first step. There is always more you can do, but I will save that for a future blog entry.


