Special Focus: Testing & Monitoring
Close the loop on network problems
Integration of change awareness
enables correlation of network events with
configuration changes.
by Pam Snaith
Preventing network downtime and performance
degradation is every IT manager's goal.
Causes are not always preventable, since major
power outages and other external events
occur, but a considerable amount of
disruption can be prevented. Industry
analysts agree that erroneous network
configuration changes, often manually
entered, cause a significant portion of
network downtime and performance
degradation, perhaps as much as 80 percent.
With so many network faults caused by
configuration changes, a key to preventing
network outages is to ensure that network
management tools are "change aware."
Integration of change awareness into the
network service and fault-management
solution closes the loop on network problems
that stem from configuration changes.
Integration enables correlation of network
events with configuration changes, giving
insight into problematic configuration
changes as part of the root-cause analysis.
It can also provide a configuration audit
trail of any selected network device through
the history that is typically retained by
network-management tools.
Why are configuration changes so often
incorrect?
There are simply so many of them. Device
configuration changes are numerous and
networks teem with routers and switches
that, in the normal course of business, need
configuration adjustments. Typically, they
are from multiple vendors, each with its own
command-line structures. The number of
changes can be daunting, and the variety of
syntax makes management a challenge.
Network device configuration is detailed
and is often handled by a few experienced
technology professionals. Manual input is
still common, and even seasoned
professionals can make mistakes when keeping
track of so many details.
Discrepancies develop between the startup
and running configurations. This can happen
when the configuration change is correct but
is not saved to non-volatile RAM. In the
event of a device reboot, the device reverts
to the old configuration.
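
A discrepancy of this kind can be surfaced by comparing the two configurations as text. The sketch below is illustrative only: it assumes both configurations have already been retrieved from the device as strings (how they are fetched is out of scope), and the function name config_drift and the sample configuration lines are hypothetical.

```python
import difflib


def config_drift(startup: str, running: str) -> list[str]:
    """Return unified-diff lines between a startup and a running configuration.

    An empty list means the two texts are identical, so a reboot
    would not change the device's behavior.
    """
    return list(
        difflib.unified_diff(
            startup.splitlines(),
            running.splitlines(),
            fromfile="startup-config",
            tofile="running-config",
            lineterm="",
        )
    )


# Hypothetical sample data: the address was changed but never saved.
startup = "hostname edge-1\ninterface eth0\n ip address 10.0.0.1 255.255.255.0\n"
running = "hostname edge-1\ninterface eth0\n ip address 10.0.0.2 255.255.255.0\n"

drift = config_drift(startup, running)
for line in drift:
    print(line)
```

Any non-empty result is a candidate for a warning that the running configuration has not been saved.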
Time delays caused by manual input can
cause problems. If five routers need the
same change, some of them will be done
before others. These incompatibilities may
create problems while the task is in
process. In addition, there is an
opportunity for less-knowledgeable staff to
input a change incorrectly if multiple
people handle configuration changes.
THE BENEFITS OF CHANGE
Preventing network problems, such as
those stemming from configuration changes,
is an opportunity to increase network
uptime, provide better business services and
improve business continuity. There are a few
steps to keep in mind when incorporating
configuration and change awareness into a
network-management solution.
One decision is which solution to deploy: a
niche standalone application or one
incorporated into the network service and
fault-management solution. Here is an
opportunity to unify and simplify overall
network management and provide new
efficiency to the IT staff.
A niche tool excels only in its native
environment. With many
different vendors and platforms, the amount
of serial workflow required is significant
and the opportunity to speed up and clean up
change processes can be lost.
An integrated solution for network
service, fault and configuration management,
with a centralized control point, will not
only enable better network management in the
first place, but also will allow staff to
rapidly spot and resolve configuration
errors. For example, intelligent thresholds
are essential to problem detection and
should include proactive alarming on key
performance indicators for a particular
service, such as voice over IP. With
integrated, change-aware network management,
configuration changes can be correlated with
network events and alarms, resulting in
easier corrections and higher availability
of critical business services that rely on
the IT infrastructure.
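
As a rough illustration of what correlating configuration changes with alarms can mean in practice, the sketch below pairs each alarm with the changes made to the same device shortly beforehand. It is a simplification under stated assumptions: the ConfigChange and Alarm records, the correlate function, the 30-minute window and the sample data are all hypothetical, and a real product would also consider changes on neighboring devices.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ConfigChange:
    device: str
    timestamp: datetime
    summary: str


@dataclass(frozen=True)
class Alarm:
    device: str
    timestamp: datetime
    message: str


def correlate(alarms, changes, window=timedelta(minutes=30)):
    """Map each alarm to changes on the same device made within `window` before it."""
    suspects = {}
    for alarm in alarms:
        suspects[alarm] = [
            change
            for change in changes
            if change.device == alarm.device
            and timedelta(0) <= alarm.timestamp - change.timestamp <= window
        ]
    return suspects


# Hypothetical sample data.
changes = [
    ConfigChange("router-a", datetime(2024, 1, 1, 7, 0), "banner update"),
    ConfigChange("router-a", datetime(2024, 1, 1, 9, 50), "QoS policy edit"),
    ConfigChange("router-b", datetime(2024, 1, 1, 9, 55), "ACL edit"),
]
alarms = [Alarm("router-a", datetime(2024, 1, 1, 10, 0), "VoIP jitter above threshold")]

for alarm, candidates in correlate(alarms, changes).items():
    for change in candidates:
        print(f"{alarm.message} on {alarm.device}: check '{change.summary}'")
```

Here only the QoS policy edit falls inside the window for the alarmed device, so it is the one change offered as a root-cause candidate.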
Managing configuration changes correctly
takes two key capabilities: awareness and
automation. Network change and configuration
management needs to "notice and notify" when
changes are made to network devices.
Awareness of configuration changes will not
always prevent downtime or degradation, but
it does provide the opportunity to make
corrections quickly, such as a fallback to a
previous, working configuration. Change
awareness, integrated into the network
service and fault-management solution,
should identify configuration changes in
real time, verify them against established
correct configurations and notify the
correct individual regarding unexpected
changes.
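
The "verify and notify" step can be sketched as a comparison of the device's current configuration against a stored fingerprint of the approved one. This is a minimal sketch, not any vendor's method: the fingerprint and check_device functions, the approved dictionary and the notify callback are hypothetical names, and the whitespace normalization is an assumption about what should count as a change.

```python
import hashlib


def fingerprint(config: str) -> str:
    """Hash a configuration after trimming trailing whitespace on each line."""
    normalized = "\n".join(line.rstrip() for line in config.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def check_device(device, current_config, approved, notify):
    """Return True if the configuration matches the approved baseline.

    `approved` maps device names to baseline fingerprints; `notify` is any
    callable taking (device, message), such as a pager or e-mail hook.
    """
    expected = approved.get(device)
    if expected is None:
        notify(device, "no approved baseline on record")
        return False
    if fingerprint(current_config) != expected:
        notify(device, "configuration differs from approved baseline")
        return False
    return True


# Hypothetical baseline and a stand-in notifier that just collects messages.
approved = {"sw1": fingerprint("hostname sw1\nvlan 10\n")}
sent = []


def notify(device, message):
    sent.append((device, message))


check_device("sw1", "hostname sw1\nvlan 20\n", approved, notify)
print(sent)
```

Storing only a fingerprint is enough to detect a change; keeping the full approved text as well would be needed to show what changed.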
Automation provides the opportunity to
complement awareness with rapid action.
Today's network-management solutions depend
on automation to detect developing
performance problems and to take immediate
action to prevent downtime. While automated
actions should be based on business policies
established by trusted technical advisors,
automating the resulting action eliminates a
great deal of risk. Automation improves both
proactive and reactive change management.
PROACTIVE AUTOMATION
Proactively, automation can implement
scheduled upgrades and deliver immediate
notification of unauthorized changes. Stored
configurations can be uploaded to multiple
devices simultaneously. Changes are
automatically tracked.
Reactively, alerts are automatically sent
to appropriate individuals when changes have
been made to device configurations. This
gives them the opportunity to make
corrections or take other action to ensure
overall network reliability. If problems do
occur, automation can roll back network
device configurations to their last known
good state. Manual corrections could never
be as fast.
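
A rollback to the last known good state presupposes a stored history in which some versions are marked as good. The sketch below shows one way that bookkeeping might look. The ConfigHistory class and its methods are hypothetical, and push stands in for whatever mechanism actually delivers a configuration to a device.

```python
class ConfigHistory:
    """Ordered history of one device's configurations, oldest first."""

    def __init__(self):
        self._versions = []  # list of (config_text, known_good) pairs

    def record(self, config, known_good=False):
        """Append a new configuration version to the history."""
        self._versions.append((config, known_good))

    def last_known_good(self):
        """Return the most recent version marked good, or None if there is none."""
        for config, known_good in reversed(self._versions):
            if known_good:
                return config
        return None

    def roll_back(self, push):
        """Push the last known good configuration and record it as current."""
        target = self.last_known_good()
        if target is None:
            raise LookupError("no known good configuration recorded")
        push(target)
        self.record(target, known_good=True)
        return target


# Hypothetical usage: the second change is suspected of causing a fault.
history = ConfigHistory()
history.record("hostname core-1\nmtu 1500", known_good=True)
history.record("hostname core-1\nmtu 9000")

pushed = []
restored = history.roll_back(pushed.append)
print(restored)
```

Recording the rollback as a new version, rather than deleting the faulty one, preserves the audit trail of what was tried.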
If configuration changes are accurate and
timely, many causes of outages are
eliminated. Integration of configuration
management within network service and fault
management helps to bring network
availability to a new level of reliability.
Change awareness ties together fault and
configuration management, simplifying the
growing complexity of managing large
infrastructures and bringing network
management in line with the importance of
the network itself in delivering business
services.
Pam Snaith is product marketing
manager, infrastructure management, at
CA, Islandia, N.Y.
by Gnanesh Dholakia
The rapid proliferation of
virtualization, optimization and Web
services technologies has increased the
complexity of IT infrastructures and changed
the relationship between infrastructure
components, applications and users. The way
current tools view the network no longer
provides the information that is vital to
effective management of business service
delivery. Network behavior analysis (NBA)
systems can provide an effective way to view
the infrastructure.
A number of factors contribute to the
challenge of maintaining satisfactory
performance and availability on an ongoing
basis. Organizational growth, mergers and
acquisitions, the increasing prevalence of
Internet-savvy users, and the proliferation
of rich media mean that network bottlenecks
and slowdowns become more frequent, often
due to bandwidth-hogging applications.
Available monitoring tools, however,
might not be able to keep pace with
increasing infrastructure complexity and
escalating service-level demands. Status
monitoring tools, for example, report on/off
status without indicating why a device is
off or what effect it is having on service
delivery. Performance-monitoring tools tend
to focus on identifying symptoms such as
latency, increases in round-trip time and
jitter, but they do not provide any insight
into the cause of these problems.
To identify the cause, alert affected users
and resolve the problem, the context of the
problem needs to be understood, including:
- whether changes could have caused
service degradation or interruption;
- what activity led to the problem;
- how that activity differs from
typical behavior; and
- which users, applications, devices,
ports and protocols are involved or
affected.
Organizations should not necessarily
discard the tools in place today and start
from scratch. Rather, organizations should
look to add a new layer of capabilities that
addresses the challenges presented by the
increased complexity and service-level
requirements.
NBA systems analyze network traffic to
provide valuable information about the
interactions of, and dependencies
between, users, applications and systems.
Customers benefit from proactive problem
resolution and reduced mean time to repair,
while ensuring the availability, performance
and security of business services.
NBA systems collect network flow data and
enhance it with application and user
identification and behavioral analytics to
present a complex infrastructure in a
business context. Predefined and
customizable analyses enable users to
identify performance and availability issues
before they disrupt business services.
Role-based presentations enable users
across IT to access this data in a format
tailored to the specific needs and workflows
of security, applications and network teams.
Usage and dependency data enable informed
optimization and change-management
decisions.
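
The article does not say which analytics NBA systems apply. As one simple illustration of comparing current activity with typical behavior, the sketch below flags a traffic measurement that lies far from its historical mean. The z-score test, the deviates function, the threshold of 3.0 and the sample figures are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def deviates(history, current, threshold=3.0):
    """Return True if `current` is more than `threshold` standard deviations
    from the mean of `history` (for example, megabytes per hour for one
    application). With fewer than two samples there is no baseline, so
    nothing is flagged.
    """
    if len(history) < 2:
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any different value is a deviation.
        return current != baseline
    return abs(current - baseline) / spread > threshold


# Hypothetical hourly traffic volumes for one application.
typical = [100, 110, 95, 105, 90]

print(deviates(typical, 104))
print(deviates(typical, 300))
```

A production system would likely use baselines that account for time of day and day of week; a single mean is the simplest possible stand-in.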
NBA systems use all of this information
to intelligently interoperate with other
systems to add value and improve workflow.
They learn from other systems, such as
identity-management systems and traffic
accelerators, to provide business context.
They feed network-management systems and
security event-management systems and update
change and configuration management
databases. They allow other systems to
understand how business services are
delivered across the infrastructure.
NBA systems enable users to manage change
in their IT infrastructures. As a result,
customers are able to ensure the
availability, performance and security of
business services, such as voice over IP,
Web services and enterprise applications, as
well as reduce costs and satisfy regulatory
requirements.
Gnanesh Dholakia is director, product
marketing, Mazu Networks, Cambridge,
Mass.