Cover Story
Lone Star schools boot rivalry
In the school rivalry hall of fame, this one
is legendary. There have been daring team
mascot kidnappings, bonfires, pranks, and
songs and cheers that call for the downfall
of the other team–all for one football game,
"The Lone Star Showdown," each year. When
everyone else comes together peacefully for
Thanksgiving, Texas A&M University (Texas
A&M) and the University of Texas (UT) go
head-to-head on the football field in one of
the nation’s top, and longest-running,
college rivalries–dating back to 1894.
Now, a new development threatens this
long-running competition. The two schools
did the unexpected: They partnered.
"Other than the one weekend every year
when the two schools play each other in
football, people are astonished that the
campuses have such collaboration," says
Wayne Wedemeyer, UT’s director of office
telecommunications services.
Not far from the football fields on
either campus, network engineers from both
schools have each other’s backs. Through a
unique failover plan, the schools ensure
that if one university’s ISP goes down, the
other will pick up the slack.
In Texas, these two universities are the
giants of higher education. Texas A&M
operates nine university campuses, seven
agencies and a health science center. Its
work force adds up to nearly 27,000 people
serving 105,000 students.
UT has nine academic campuses and six
healthcare institutions around the state. It
employs 81,000 people and enrolled 194,000
students in the 2007 academic year.
In recent years, coastal and other south
Texas campuses of both university systems
have been impacted by storms. In 2001,
Tropical Storm Allison put much of downtown
Houston underwater. Hurricane Rita hit the
southeast Texas coast in 2005. These events,
and the risk of outages from other causes,
were of concern to both school systems.
"Each school had its own ISP. If a
provider went down, that school lost
connectivity. It’s happened a few times over
the years," says Willis Marti, director of
networking and chief information security
officer for Texas A&M. "The administration
has stressed the importance of having the
network up 24/7."
As the networks and applications became
more important to the universities, a more
resilient network infrastructure was
required. In 2005, the universities were
part of a regional optical network that
increased their connectivity around the
state. As the universities built separate
connections, they each evaluated what
building redundant paths would require.
High price for redundancy
Taking into consideration needed
equipment enhancements (routers and
switches), bandwidth and IT support
requirements, they found that redundancy
carried a high price. Instead of carrying
the additional cost alone, Marti and
Wedemeyer looked into leveraging the
resources of both universities as the
solution for enhancing continuity.
"If one provider goes down, we want to be
able to go to another provider," Marti says.
"We realized we could build redundancy in.
There was no need for both of us to have
contracts with two providers each, so we
decided to share bandwidth around the state
and configure it for physical redundancy for
either university."
Each university increased its bandwidth
with its service provider (Qwest and Level 3
Communications) to accommodate both
university systems, if needed.
"By having a regional optical network, we
essentially figured out we could spend about
the same amount of money and go from 100
megabytes to a gigabyte per second worth of
bandwidth," Marti explains. "It’s not that
much more expensive to buy that extra
guarantee of system availability."
Bridging networks and teams from
different groups is no small task,
especially in a state as big as Texas. The
merging of the two unique and complex
networks required careful planning to
accommodate the different infrastructures in
place at the universities. For example,
Texas A&M uses Cisco 7600 routers, while UT
uses Juniper routers. In addition, the IT
organizations had different skill sets and
experience that had to be addressed.
Instead of trying to fully integrate and
choose a single vendor, Marti and Wedemeyer
chose to merge at one point–at the tip of a
pyramid–using a Juniper router at UT’s
Austin campus. That one router connects the
rest of the network.
The universities implemented a Layer 2
switch topology using Cisco 6500s to share a
physical structure. Each campus owns two
6500s, which they keep at the exact same
settings and the proper virtual LAN (VLAN)
configurations. Instead of a ring
environment, they chose to keep both sides
in the correct configurations to prevent the
spanning tree from shutting off. Each campus
can jointly access the network and make
changes.
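The idea of keeping paired switches at matching settings can be sketched in a few lines of Python. This is only an illustration, not the universities' actual tooling: the `SwitchState` record, the switch names and the version strings are all hypothetical, and a real check would pull this data from the devices themselves.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class SwitchState:
    """Hypothetical snapshot of one Layer 2 switch (illustrative only)."""
    name: str
    os_version: str
    vlans: FrozenSet[int] = field(default_factory=frozenset)


def compare_pair(a: SwitchState, b: SwitchState) -> List[str]:
    """Return human-readable mismatches between two paired switches."""
    problems = []
    if a.os_version != b.os_version:
        problems.append(
            f"OS version differs: {a.name}={a.os_version}, {b.name}={b.os_version}"
        )
    # VLANs defined on one switch but missing from its partner.
    only_a = sorted(a.vlans - b.vlans)
    only_b = sorted(b.vlans - a.vlans)
    if only_a:
        problems.append(f"VLANs only on {a.name}: {only_a}")
    if only_b:
        problems.append(f"VLANs only on {b.name}: {only_b}")
    return problems
```

An empty list from `compare_pair` means the two switches agree on software version and VLAN set; anything else is a mismatch to resolve before it causes trouble on the shared structure.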
Following a staged approach, they first
set up the physical redundancy, then the
Layer 2 redundancy and then merged the
networks. "We flipped College Station on
Monday, Tuesday set up the redundant path
and then Wednesday the rest of the system,
and didn’t have any problems," Marti
recalls.
In total, the process took no more than
three days to complete.
"Once we got the physical connectivity in
place, we turned a couple of switches on and
there we were," Marti says.
Traffic goes where it wants
In what Marti calls "The National Network
of Texas," neither of the universities plays
traffic cop. They simply let the traffic go
where it wants. Some traffic goes through
Texas A&M’s ISP and some goes out UT’s. With
full redundancy, if one source fails
completely, the other takes over.
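As a toy model of that behavior, the Python sketch below spreads flows across whichever Internet exits are up and sends everything to the survivor when one fails. It is an assumption-laden illustration: in a production network the routing protocols make this decision, and the provider labels `isp_a` and `isp_b` and the function names are hypothetical.

```python
from typing import Dict, List, Optional


def usable_exits(provider_up: Dict[str, bool]) -> List[str]:
    """List the providers currently reachable, in a stable order."""
    return sorted(name for name, up in provider_up.items() if up)


def pick_exit(flow_id: int, provider_up: Dict[str, bool]) -> Optional[str]:
    """Choose an exit for a flow; return None if every provider is down."""
    exits = usable_exits(provider_up)
    if not exits:
        return None
    # Spread flows over all healthy exits. With one exit left,
    # every flow lands on the survivor.
    return exits[flow_id % len(exits)]
```

With both providers up, flows alternate between the two exits. Mark one as down and every flow is assigned to the other, which is the failover behavior described above.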
While the actual cutover went smoothly,
the next test was managing the new
complexity of the network infrastructure. In
the past, Texas A&M had used an Excel
spreadsheet for VLAN control across its 340
buildings on 5,200 acres.
"As far as controlling versions, it was a
very manual process," Marti offers. "You
just hope you get the same vendor."
Since the merged network involved
multiple vendors, the teams chose to keep
the same version of code on the Layer 2
switches interconnecting the networks and
to automate configuration control, so a
student could not just sit down at a
console and change things.
To help automate the change process and
manage network configurations, Texas A&M
used NetMRI from Netcordia to identify
versions and configurations across the
network in order to find differences among
them and keep them consistent. The automated
process of managing configuration and change
freed the staff from manual processes, and
allowed Texas A&M to push changes out
quickly, correctly and consistently across
devices.
"NetMRI tells us when something changed
and we can trace it back through access
control to who changed it, when and why, and
change it back if we need to," Marti says.
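The general technique behind that kind of change detection is a comparison of a device's running configuration against a known-good baseline. The Python sketch below shows the idea with the standard library's `difflib`. It is not NetMRI's interface or method, and the function name, device name and configuration lines are hypothetical.

```python
import difflib
from typing import List


def config_drift(baseline: str, running: str, device: str) -> List[str]:
    """Return unified-diff lines showing how running differs from baseline.

    An empty list means the device still matches its approved baseline.
    """
    diff = difflib.unified_diff(
        baseline.splitlines(),
        running.splitlines(),
        fromfile=f"{device}/baseline",
        tofile=f"{device}/running",
        lineterm="",
    )
    return list(diff)


if __name__ == "__main__":
    approved = "vlan 10\nvlan 20"
    current = "vlan 10\nvlan 20\nvlan 30"
    for line in config_drift(approved, current, "example-switch"):
        print(line)
```

Lines prefixed with `+` or `-` in the result mark what was added or removed since the baseline. Restoring the baseline text is the "change it back" step, while answering who made the change and why would require access-control logs that this sketch does not model.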
Both universities realize dramatic time
and cost efficiencies with the shared
network and automated configuration
management. Each pays about the same as it
would for its own traffic alone, but has
the protection of physical and traffic
redundancy.
Currently, Texas A&M has eight engineers.
Marti estimates that, without automated
configuration and change management, his
team would need additional engineers.
"One of the things about being a state
institution is we’re not flush with people,"
he says. "Automated configuration and change
management with NetMRI lets us do more with
the same amount of people. Otherwise, we
would need three people going through
hundreds of routers and thousands of
switches."
Likewise, UT has seen substantial cost
savings as a result of the shared network.
"Instead of a staff of 12, we would have
needed probably 20 to 25 people at campuses
across Texas," Wedemeyer says. "Adding
network costs, that would have been a total
of three to four times more in annual costs.
It’s a very significant cost savings for
us."
Communication is key
To date, individual campuses have had
physical outages a couple of times, but they
have not lost Internet connectivity. Both
Marti and Wedemeyer attribute the overall
success to the detailed planning,
coordination and continuous communication
between both universities.
Planning ahead meant understanding the
exact current status of the network
infrastructure. In the Texas A&M and UT
partnership, understanding the situation
before transitioning the networks was
essential, especially with a multiple-vendor
architecture and distinct IT staffs.
With distinctly different practices and
vendors, the IT departments at Texas A&M and
UT found transparency and open lines of
communication paramount to success when
bringing together something so large and
complex. Hidden agendas or lack of
communication could have severely impacted
the outcome.
"Be fully transparent to the other entity
about what you want to do," Wedemeyer says.
"Carry on conversations about design
solutions. Be persistent. You have to keep
working at it. In a state as big as Texas,
there were many variables we didn’t plan
for, and we relied on our partner to work
through them."
Wedemeyer and Marti suggest:
- Every organization should have consistent
  policies and procedures for network
  configuration and change management. In
  a partnership between organizations, the
  policies need to be even more defined to
  make sure all IT departments understand
  and follow a consistent strategy.
- A successful implementation requires
  understanding how the changes impact
  other aspects of the network and overall
  performance. Simply tracking changes is
  not enough.
- Monitor and improve continuously. With
  such a widespread and complex network,
  changes are always occurring, both
  planned and unplanned. Visibility into
  changes helps ensure that end-users and
  organizations have adequate service
  levels from the IT organization.
The merged network opens the doors to
other opportunities for Texas A&M and UT.
This fall, when Super Computing ’08 comes to
Austin, both schools will contribute to
getting the bandwidth across Texas to the
convention center.
The conference highlights the need for
Texas A&M and UT to stay on top of network
configuration and change management as new
requirements are added every day. The power
of following best practices lies in not
letting change take control of the network.
When the IT organizations control
configuration and change, they can take
advantage of the partnership and leverage
the combined infrastructure for even more
benefits.
Going forward, the campuses are collaborating
more in areas such as distance learning,
remote data center services and disaster
recovery–all bandwidth- and
resource-intensive applications that require
a high level of service quality throughout
the entire state. Both universities expect
that collaboration to continue to grow
between legendary arch rivals.
Just do not tell the football teams or
fans. Nobody wants to spoil a good rivalry.
About Netcordia
Don Pyle joined Netcordia as CEO in June of
2006, already a 25-year veteran of the
network infrastructure market. Previously,
he was CEO of Laurel Networks, which was
acquired by ECI Telecom in 2005. Pyle also
has held sales leadership positions with
Juniper Networks, Cisco and StrataCom.
Netcordia’s NetMRI network configuration
and change-management software identifies
configuration and policy anomalies and
potential vulnerabilities within large,
complex multivendor infrastructures–and ties
urgency to the applications and business
units at risk. More than 250 healthcare,
financial services, telecommunications,
academic, service and government
organizations use NetMRI.