Special Focus: Storage

From the May 2005 issue of Communications News

Firm manages SAN changes

State Street Global Advisors (SSgA), with $1.4 trillion in assets under management, is the investment-management arm of State Street Corp. SSgA has 27 offices and 10 investment centers across five continents. Three major divisions constitute these securities businesses, with all three groups residing at the State Street Financial Center (SSFC) in Boston. The SSgA storage services team supports the SSFC businesses.

Availability and stability within the SSFC SAN are critically important: a server outage could have severe business and financial impact on State Street. Therefore, State Street changed control practices across all infrastructure components. Without mature instrumentation and processes, however, the storage team could not understand the full impact of changes within the SAN, which increased risk within the environment.

Before the team implemented predictive change management, all asset inventories within the SAN were tracked manually in spreadsheets. Little or no change analysis was done within the environment, and validation of changes was limited. As the environment rapidly expanded, understanding the effect of even small changes became far more difficult.

This configuration-tracking method was difficult to maintain and could not provide any change-tracking or simulation functionality. The spreadsheets grew to cover thousands of rows. Changes to the environment were simply updated within the spreadsheet when and if they were known. Native SAN management tools were frequently scanned for changes, and many times issues were not known until an outage occurred. The number of native tools and alerts climbed until staff became desensitized to them and correlating alerts after a change became difficult.


Tom Dever, enterprise storage engineer at State Street Global Advisors, Boston.

SSgA’s SAN has grown by 10 storage arrays and hundreds of hosts over the past two years; new technologies have enabled tiered service delivery, and multisite and protocol connectivity. In a move toward continuous operations, traditional disaster recovery site servers can now access storage devices in the primary production sites. This has further supported the need for predictive SAN change management. The environment today consists of 250 servers, six operating systems, 1,500 World Wide Names (WWNs), 1,500 Fibre Channel switch ports, 28,000 logical devices and 14 storage arrays.

The storage team executes approximately 350 planned changes per year, and that number is growing. Without attention to detail and change management, any combination of these physical and logical paths could be changed and data could be corrupted.

“We make extremely large block trades and stability in our storage is essential,” offers Robert Shinn, principal at SSgA. “We can’t have applications down any longer than people can hold their breath.”

The SSgA storage team partnered with Boston-based Onaro to deploy predictive change-management instrumentation that provides a heterogeneous view of SAN assets and correlates changes within the environment against set policies. Onaro’s SANscreen is a read-only analysis and validation solution that does not make changes to the environment and is unbiased toward other vendors’ configurations. The storage team is maturing standards and procedures around using the information from this tool and other native tools to simulate changes and communicate the real business risk within the environment, so that it can be mitigated.
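To illustrate what simulating a change against a policy can mean in practice, here is a minimal Python sketch. It is not SANscreen's method or API, which the article does not describe. The data model (a path as a host, fabric and array triple), the policy (every host-to-array pair must be reachable through at least two fabrics) and all names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical inventory: each access path is (host, fabric, array).
# These names are invented; they do not describe SSgA's environment.
PATHS = [
    ("db01", "fabricA", "array1"),
    ("db01", "fabricB", "array1"),
    ("web01", "fabricA", "array1"),
]


def violations(paths, required_pairs, min_fabrics=2):
    """Return the (host, array) pairs reachable through fewer than
    min_fabrics distinct fabrics (an assumed redundancy policy)."""
    fabrics = defaultdict(set)
    for host, fabric, array in paths:
        fabrics[(host, array)].add(fabric)
    return sorted(
        pair for pair in required_pairs
        if len(fabrics.get(pair, ())) < min_fabrics
    )


def simulate_change(paths, removed, min_fabrics=2):
    """Report the policy violations a planned path removal would
    introduce. Read-only: it works on copies of the inventory."""
    # Every pair that has a path today is expected to keep access.
    required = {(host, array) for host, _, array in paths}
    before = set(violations(paths, required, min_fabrics))
    removed = set(removed)
    remaining = [p for p in paths if p not in removed]
    after = violations(remaining, required, min_fabrics)
    # Only report problems the change itself would create.
    return [pair for pair in after if pair not in before]


# Planned change: take db01's fabricB path out of service.
NEW_RISKS = simulate_change(PATHS, [("db01", "fabricB", "array1")])
```

In this example, `NEW_RISKS` contains only the `db01` pair, because `web01` was already single-pathed before the change. Separating existing vulnerabilities from newly introduced ones is the kind of distinction a change review would need to make.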

“With predictive change management, stability increases from the reduction of vulnerabilities within the SAN,” Shinn says. “When initially tracking the environment manually, we did not have the capability to simulate and validate changes. We never knew how changes would affect the environment and how many vulnerabilities from configuration existed.”

Today, the storage team can report the number of paths within the environment that pose risk from change and configuration problems. When the tools and processes were first implemented, more than 20 hosts had configuration issues. Now that the environment has been reconciled, potential problems are proactively fixed before an outage from change can occur.

Storage configurations can now be validated against standards, and the root cause of problems arising from unplanned changes can be quickly isolated. Developing this level of instrumentation and automation has allowed growth without an increase in operational resources. The team has actually decreased in size over the past year.
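One simple way to isolate unplanned changes is to compare two configuration snapshots and set aside everything covered by an approved change record. The Python sketch below shows that idea only. The snapshot format, the item names and the notion of an approved set are hypothetical, and nothing here reflects how SSgA's tools actually work.

```python
# Hypothetical snapshots mapping a configuration item to its setting.
BEFORE = {
    "switch1/port4": "zoneA",
    "switch1/port5": "zoneA",
    "array1/lun12": "host=db01",
}
AFTER = {
    "switch1/port4": "zoneB",
    "switch1/port5": "zoneA",
    "array1/lun13": "host=db01",
}
# Items named in approved change records (assumed to be tracked elsewhere).
APPROVED = {"switch1/port4"}


def diff_snapshots(before, after):
    """Compare two snapshots and list added, removed and modified items."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(k for k in shared if before[k] != after[k]),
    }


def unplanned(changes, approved):
    """Keep only the changed items not covered by an approved change."""
    return {
        kind: [item for item in items if item not in approved]
        for kind, items in changes.items()
    }


CHANGES = diff_snapshots(BEFORE, AFTER)
SUSPECTS = unplanned(CHANGES, APPROVED)
```

Here the rezoning of `switch1/port4` is filtered out because it was approved, which leaves the LUN removal and addition as the items to investigate first.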

For more information from Onaro:
www.rsleads.com/505cn-254

This article was provided by Tom Dever, enterprise storage engineer at State Street Global Advisors, Boston.