NETWORK MONITORING
From the August 2006 issue of Communications News
Shorten the problem lifecycle
by Eileen Haggerty

The process of isolating and fixing network-based performance problems frequently can be slow, labor-intensive and heavily dependent on the capabilities of the person completing the troubleshooting.

The typical approach in most enterprises today is to set static thresholds manually, which requires repetitive adjustments on an often substantial number of monitored network segments, virtual circuits and/or applications. Such threshold alarms are often ineffective, however, because they have been de-tuned or set at excessively high levels to prevent operations teams from being barraged by too many false alarms. Troubleshooting and resolution efforts are also hindered by the lack of objective, clear information to help localize the root of problems.

As networks, users, applications and traffic volumes continue to grow in both numbers and complexity, the process of performance alarming needs to evolve through automation.

Ruling out the network as the source of problems is step one, followed by troubleshooting the applications, servers, databases and any other element in the path of the affected application traffic flow. Eventually, the root cause of the problem is discovered, corrective action is taken, and normal service is restored. But how long did it take to resolve that problem?

Many established enterprise applications are based on a simple client-server model, with easily defined communications paths. Emerging applications, however, are increasingly distributed and multitiered, designed to yield significant benefits in reusability, scalability and responsiveness to the needs of the business. Unfortunately, these benefits can be negated, and service performance and availability compromised, by the added complexity of how they communicate with the end-users across the network.
This is due to the interactions and dependencies between multiple service elements, and contention for shared and virtualized resources used by the applications over the network. This new concept of “the application fabric” highlights the importance of the interconnected virtualized resources across the physical infrastructure (Web servers, switches, routers, application servers and storage) that are shared by both traditional and emerging applications.

While infrastructure companies are making attempts to deliver an application fabric that is more robust via faster network connections, a new performance-management approach is required that goes beyond the limitations of management tools designed for client-server application architectures and network equipment alone. New approaches will need to monitor the behavior of applications in conjunction with the underlying fabric resources, while simultaneously addressing the scalability and responsiveness challenges inherent in managing massively distributed and virtualized application infrastructures. This approach will give operators a head start in reducing the length of application and network degradation problem lifecycles.

Visibility is required for optimizing performance of the company’s customer- and revenue-affecting applications and business services. This is accomplished by leveraging intelligence already available in the enterprise network, such as SNMP management information bases, in combination with distributed network-based instrumentation dedicated to monitoring every voice and data application and conversation in the network. A software application that analyzes the information collected by this network intelligence should deliver a unified set of features and functions covering multiple performance-management tasks, including application, voice and network monitoring, troubleshooting, response-time analysis, capacity planning, and reporting.

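As one illustration of how analysis software might replace manually set static thresholds, the short Python sketch below compares a fixed threshold with a simple rolling baseline. It is a minimal example under stated assumptions, not a description of any vendor's product: the function names, the sample utilization figures, the window size and the multiplier `k` are all hypothetical, and a mean-plus-deviation baseline is only one of many possible anomaly tests.

```python
from collections import deque
from statistics import mean, pstdev


def static_alarms(samples, threshold):
    """Indices of samples that exceed a fixed, manually chosen threshold."""
    return [i for i, value in enumerate(samples) if value > threshold]


def baseline_alarms(samples, window=12, k=3.0, min_spread=1.0):
    """Indices of samples that deviate sharply from a rolling baseline.

    The baseline is the mean of the last `window` normal samples. A sample
    raises an alarm when it differs from that mean by more than `k` times
    the standard deviation (floored at `min_spread` so a perfectly flat
    history does not flag tiny changes). All parameters are illustrative.
    """
    history = deque(maxlen=window)
    alarms = []
    for i, value in enumerate(samples):
        if len(history) == window:
            center = mean(history)
            spread = max(pstdev(history), min_spread)
            if abs(value - center) > k * spread:
                alarms.append(i)
                continue  # assumption: keep anomalies out of the baseline
        history.append(value)
    return alarms


# Hypothetical link utilization in percent, one sample per polling interval.
utilization = [20, 22, 21, 19, 20, 23, 21, 20, 22, 21, 20, 22, 55, 21, 20]

# A threshold de-tuned to 90 percent to avoid false alarms misses the jump.
print(static_alarms(utilization, threshold=90))  # -> []

# The rolling baseline flags the sample that departs from recent behavior.
print(baseline_alarms(utilization))  # -> [12]
```

In this sketch the baseline adjusts itself as traffic patterns drift, so no per-segment threshold has to be retuned by hand; the trade-off is that the window and multiplier still need sensible defaults.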
Analyzing traffic flows on a segment and all its associated virtual circuits and/or quality-of-service classes to discover changes or anomalies in traffic volume and type will alleviate the need to set multiple static threshold alarms. It will also help uncover the root cause of problems as they emerge, by analyzing packet-level details for traffic type, as well as application details (SAP, Citrix, HTTP), to pinpoint precisely what is increasing the traffic volume.

Eileen Haggerty is director of solutions marketing for NetScout Systems, Westford, Mass.