Data Center Exposure and Recovery in New York City

Hurricane Sandy provided a fascinating opportunity to study both the level of disaster planning and the resilience of New York City data centers. This article examines a) what actually happened, b) what the risk was, and c) what lessons were learned.

What Actually Happened?

Simply put, data centers in New York were caught off guard. Consider these incidents.

Internap and Peer 1, located at 75 Broad Street, suffered basement-level flooding which knocked out diesel fuel pumps.

 

Datagram, located at 33 Whitehall, experienced the exact same problem: 5 feet of water in the basement. As a result, several high-profile blogs and numerous websites went dark.

Both of these facilities are located in a Zone A flood zone. Zone A is FEMA’s second highest risk category.

Then there were fuel supply issues. Fog Creek, which makes and hosts Trello, Copilot and other popular platforms, is hosted at Peer 1 and had to assemble a bucket brigade to carry diesel fuel up 17 stories to refuel a generator there. As a precaution, Trello was moved to Amazon Web Services and seems to have suffered limited downtime, but the bucket brigade was still required.

Shoretel, the VoIP provider, had 3 data centers, all in lower Manhattan. One of them, at 75 Broad St, did successfully switch over to generator power, but due to “city restrictions” the generators were shut down. As a result, 700 customers went down.

Fortunately, things did not get worse for Fog Creek, but carrying 5-gallon buckets of diesel fuel up 17 stories in a building with power problems strikes us as a recipe for something truly horrible.

 

Teams from Squarespace fill buckets with diesel fuel to haul them up 17 stories to the generator keeping the data center online. Staff from Peer 1, Squarespace and Fog Creek Software have formed this unusual Internet bucket brigade. (Photo via Squarespace)

A typical rack of servers requires 5 to 10 kW of power, including cooling/HVAC. Typical data centers range in size from 5,000 to 40,000 square feet. A mid-sized facility at 20,000 sq ft would house about 600 racks, which equates to roughly 5 megawatts (MW) of power. A reasonably efficient diesel generator would require roughly 200 gallons of diesel per hour to push out 5 MW; that’s a bit over 3 gallons per minute.

Typically, data centers tell us they have 1 week of diesel onsite and a resupply contract. A full week for a 20,000 sq ft data center at that burn rate is roughly 34,000 gallons (200 gallons per hour × 168 hours). We suspect that in lower Manhattan, the standard was more like 1 day. Then resupply problems hit because of street flooding and road and bridge closures.
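The arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the numbers quoted above (600 racks, 5 to 10 kW per rack, 200 gallons per hour); the function names are our own, and real generator burn rates vary with load and equipment.

```python
# Back-of-the-envelope check of the article's power and fuel figures.
# All constants come from the text above; the function names are illustrative.

HOURS_PER_DAY = 24


def facility_load_mw(racks: int, kw_per_rack: float) -> float:
    """Total facility load in megawatts for a given rack count."""
    return racks * kw_per_rack / 1000.0


def fuel_needed_gallons(gallons_per_hour: float, days: float) -> float:
    """Diesel required to run for `days` at a constant burn rate."""
    return gallons_per_hour * HOURS_PER_DAY * days


def runtime_days(tank_gallons: float, gallons_per_hour: float) -> float:
    """How long an onsite tank lasts at a constant burn rate."""
    return tank_gallons / (gallons_per_hour * HOURS_PER_DAY)


if __name__ == "__main__":
    # 600 racks at 5 to 10 kW each brackets the article's "roughly 5 MW".
    print("Load range (MW):", facility_load_mw(600, 5), "to", facility_load_mw(600, 10))
    # 200 gallons per hour expressed per minute.
    print("Gallons per minute:", round(200 / 60, 2))
    # One week versus one day of fuel onsite.
    print("One week of fuel (gal):", fuel_needed_gallons(200, 7))
    print("One day of fuel (gal):", fuel_needed_gallons(200, 1))
```

The gap between the one-week and one-day figures is the point: a site holding only a day of fuel has to be resupplied almost immediately, which is exactly what the flooded streets prevented.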

 

What was the Risk?

The Mid-Atlantic States do not see nearly as many hurricanes as the Southeast and the Gulf Coast of the United States. The average return period for hurricanes within 50 miles of New York City is 18 to 19 years.
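To put a return period in planning terms, it can be converted into a rough probability over the life of a facility. The sketch below assumes each year is independent and that the annual chance is simply one over the return period; that is a simplification we are introducing for illustration, not a NOAA method, and the function names are our own.

```python
# Convert a return period into rough probabilities.
# Assumption (ours): years are independent and the annual probability
# of a hurricane passing nearby is 1 / return_period.


def annual_probability(return_period_years: float) -> float:
    """Approximate chance of at least one event in any single year."""
    return 1.0 / return_period_years


def prob_at_least_one(return_period_years: float, horizon_years: int) -> float:
    """Chance of at least one event over a multi-year planning horizon."""
    p = annual_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


if __name__ == "__main__":
    # Midpoint of the 18 to 19 year return period quoted in the text.
    period = 18.5
    print("Annual probability:", round(annual_probability(period), 3))
    print("Over a 10-year horizon:", round(prob_at_least_one(period, 10), 3))
```

Under these assumptions the annual chance is a little over 5 percent, but over a ten-year lease it climbs to a bit more than 40 percent, which is hard to dismiss as a remote risk.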

For most of hurricane season, the typical hurricane tracks observed by NOAA take these storms out to sea at the more northern latitudes of the NYC area.

Here are the July, August and September typical tracks:

[Figure: typical July hurricane tracks]

[Figure: typical August hurricane tracks]

[Figure: typical September hurricane tracks]

But look at how this changes in October:

[Figure: typical October hurricane tracks]

 

And notice how closely Hurricane Sandy lined up with the typical October track.

[Figure: Hurricane Sandy track]

 

Finally, what about the frequency of storm origin in October? Compare the frequency map for storms originating August 21 – 31, which is the peak of hurricane season, with the October 11 – 20 origin map:

[Figure: storm origins, August 21 – 31]

[Figure: storm origins, October 21 – 31]

You can see that activity is less in October, but it’s hardly dormant as it is a few weeks later:

[Figure: storm origins, November 21 – 30]

Just as August and September are the periods of greatest risk in the Southeast and the Gulf Coast, October clearly presents the greatest risk of hurricanes in NYC.

What is the Solution?

If these providers had built to the following standards, downtime would have been minimized:

  • One week of fuel for standby power onsite
  • Resupply plan for fuel in place – or
  • A redundant or backup site more than several hundred miles away

For any disaster recovery, hosting or colocation solution, we would look to the Uptime Institute, which publishes the Data Center Site Infrastructure Tier Standard for Operational Sustainability.

Based on their standard, we’d offer the following. Red indicates the higher risk profile of Lower Manhattan.

Disaster Risk Component    | Higher Risk                   | Lower Risk
Flooding and Tsunami       | < 100 Year Flood Plain        | > 100 Year Flood Plain
Hurricanes and Tornadoes   | High                          | Medium
Seismic Activity           | Zone 3 or 4                   | Zone 2A or 2B
Airport/Military Airfield  | < 3 miles from active runway  | > 3 miles from active runway
Adjacent Properties        | Chemical plant, etc.          | Office buildings, land
Transportation Corridors   | < 1 mile                      | > 1 mile
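As a rough illustration of how the table could be applied during site selection, the sketch below flags which components put a candidate site in the higher-risk column. The `Site` fields, the function name and the sample values are hypothetical, and reading "< 100 Year Flood Plain" as "inside the 100-year flood plain" is our interpretation; this is not an Uptime Institute tool.

```python
# Hypothetical site-screening sketch based on the table above.
# The thresholds come from the table; the field and function names are ours.
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    in_100_year_flood_plain: bool      # our reading of "< 100 Year Flood Plain"
    hurricane_tornado_exposure: str    # "high" or "medium"
    seismic_zone: str                  # e.g. "2A", "2B", "3", "4"
    miles_to_active_runway: float
    miles_to_transport_corridor: float
    adjacent_hazard: bool              # chemical plant or similar next door


def higher_risk_components(site: Site) -> List[str]:
    """Return the table rows where the site falls in the higher-risk column."""
    flags = []
    if site.in_100_year_flood_plain:
        flags.append("Flooding and Tsunami")
    if site.hurricane_tornado_exposure.lower() == "high":
        flags.append("Hurricanes and Tornadoes")
    if site.seismic_zone in {"3", "4"}:
        flags.append("Seismic Activity")
    if site.miles_to_active_runway < 3:
        flags.append("Airport/Military Airfield")
    if site.adjacent_hazard:
        flags.append("Adjacent Properties")
    if site.miles_to_transport_corridor < 1:
        flags.append("Transportation Corridors")
    return flags


if __name__ == "__main__":
    # Sample values are made up for illustration, not measurements of a real building.
    candidate = Site(True, "High", "2A", 5.0, 0.5, False)
    for component in higher_risk_components(candidate):
        print("Higher risk:", component)
```

A checklist like this does not replace a formal assessment, but it makes the trade-offs visible before a lease is signed.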

 

To review your site’s risk of various natural disasters, see our Natural Disaster Risk Maps.

 

Business Pandemic Plan – Are You Prepared?

On June 11, 2009, the World Health Organization (WHO) raised the worldwide pandemic alert level to Phase 6 due to the spread of the H1N1 virus. The virus, also known as Swine Flu, has rapidly established itself and will persist in the coming months as it continues to move through susceptible populations. The Centers for Disease Control and Prevention (CDC) has released a guidance report recommending actions that employers should take now to decrease the spread of seasonal flu and 2009 H1N1 flu in the workplace and to help maintain business continuity during the 2009-2010 flu season. The document states that employers who have developed pandemic plans should revise them in light of the current 2009 H1N1 influenza outbreak to take into account the extent and severity of disease in their community. CDC anticipates that more communities may be affected than in the spring and summer of 2009, that they may be more severely affected, or both, reflecting wider transmission and possibly greater impact.

NYSE in Chaos

A computer glitch stopped trading for 40 minutes in more than 200 stocks at the New York Stock Exchange today. One of the key computer servers used to conduct trading lost connectivity to the trading network. As a result, stocks such as General Electric and ExxonMobil could not be traded.

Until a backup server was put in place, trading in those shares was stopped on the floor and the world witnessed the NYSE in chaos. Any trades sent to the NYSE during the outage were immediately cancelled and the brokers were notified. If a trade was denied while the NYSE’s system was down, it would have been rerouted temporarily to an alternative market or exchange. NYSE spokesman Ray Pellecchia stated, “We’ll review the problem tonight, but it’s back to business as usual.”

While we recognize that running the NYSE is vastly complex, it seems that anything this important to the U.S. and world economy should have better data protection and continuity plans in place. One has to wonder about the cost in lost trades and missed opportunities.

Sensitive Data Missing From National Archives

The National Archives lost a computer hard drive containing massive amounts of sensitive data from the Clinton administration. The drive went missing from the Archives facility while Archive members were converting the Clinton administration information to a digital records system. The drive contained 1 terabyte of data, which is enough information to fill millions of books. The data ranged from Secret Service and White House operating procedures to social gatherings and political records, in addition to Social Security numbers.

The hard drive was located in an area where at least 100 badge-holders passed. National Archives members have known that the drive was missing since March 24. It has not yet been determined whether the loss was the result of theft or accidental loss.

The National Archives is now offering a $50,000 reward for information leading to the recovery of the missing computer hard drive.

Terrorist Attacks – History Repeats Itself

In July of 1993, eight individuals were arrested and later convicted for plotting terrorist attacks on key sites in Manhattan. The sites included the St. Regis, the Waldorf-Astoria and the UN Plaza hotels, as well as the Holland and Lincoln tunnels. Fred Burton, VP of Counter Terrorism and Corporate Security at Stratfor, stated, “The militants planned to storm the island armed with automatic rifles, grenades and improvised explosive devices.” Luckily this specific plan, which was later identified as the Landmarks Plot, failed. The goal of the attack was to kill as many people as possible.