IT Operations – The Unsung Heroes
Saturday, November 3, 2012 at 1:30PM
Gary L Kelley in Co-Location, Disaster Recovery, IT, Operations, Squarespace

This is a story of how one company and its operations staff kept the lights on in the aftermath of Hurricane Sandy in New York City.

Three blogging websites are under my general control:

Squarespace is used to “host” these sites.  WordPress was our original choice, and served us for a while.  I was just never a fan of how WordPress “thinks.”  Personally, I prefer Squarespace.

Squarespace uses PEER1 as their co-location provider, located in New York at 75 Broad Street.

You can imagine my personal dismay when, on Tuesday, October 30 at 11:34AM, I got the following message from Squarespace:

I have some unfortunate news to share. Our primary data center, Peer1, in Lower Manhattan lost power yesterday at about 4:30PM local time. At that time, we smoothly made the transition to generator power and took comfort over the fact that we had enough fuel to last three to four days. (Peer1 stayed online during the last 3 major natural disasters in the area, including a blackout that lasted for days.)

At 8:30PM yesterday, we received reports that the lobby in the data center’s building was beginning to take on water. By 10:30PM, as is sadly the case in most of Lower Manhattan, Peer1’s basement had experienced serious flooding. At 5AM, we learned our data center’s fuel pumps and fuel tanks were completely flooded and unable to deliver any more fuel. At 8AM, they reported that the generators would be able to run for a maximum of four more hours.

Unfortunately, this means that Squarespace will be offline soon (our estimate being at 10:45 AM today).

I then did what any IT ops person would do…and notified my users of this outage:


Of course, I then did what any user would do, and emailed Squarespace support (like they had time for me).

Can you guys toss up a graphic of some kind so people accessing my sites won’t get a dns error?

(Also, when you’re back there is nothing to do)?

An amazingly fast 26 minutes later, I had a response:

Great question! We will have a holding page up (hosted outside of Squarespace) that will provide messaging about the downtime. Any customers trying to access sites during that time will see that message. Once we are able to bring the system back up, there will be nothing required of you in order for your website to come back online. We expect sites to be available for another 45 minutes at least and please keep an eye for updates on Twitter (@Squarespace, @Squarespacehelp) as we will be providing updates as regularly as possible from there. 

Hope this helps!

Shaun H

Of course, being rather chatty, I responded with:

I will expect something very creative….

Like an overhead view of Sandy going down a toilet bowl. (Trying to bring a smile to your face during this tough time.)

Think the Squarespace version of the Twitter flying whale.

And again, I had a quick response:

Hey Gary,

That’s a great and hilarious suggestion, thank you :) We will definitely keep you updated on our Twitter accounts and Blog page (for as long as we can):

https://twitter.com/squarespace

and

http://blog.squarespace.com/

Hope this helps.

Paulina V.

What then followed was something I find speaks to the spirit of a team focused on service. 

They carried fuel to the generator on the roof.  17 floors.  All by hand.  Squarespace, their co-lo provider PEER1, another company Fog Creek (an online project management firm for collaborative software development) and some hired contractors carried the fuel to the 17th floor where it could be pumped up to the generator on the roof (18th floor).

Ok, let’s do some math.  According to PEER1’s Meredith Eaton, a company spokeswoman, the generator’s consumption rate was about 40 gallons/hour.  That’s eight 5-gallon pails an hour.  At 7.15 pounds/gallon for diesel, that’s 286 pounds an hour up 17 floors.  And they did this for a couple of days…so at 48 hours this is 13,728 pounds of fuel, or nearly 7 US tons.
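For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same calculation. The consumption rate, pail size, diesel weight and 48-hour duration are the figures quoted above; the function name `fuel_haul` and the constant names are just illustrative choices, and 2,000 pounds per US ton is the standard short-ton conversion.

```python
# Figures taken from the post; names are illustrative, not from any official source.
GALLONS_PER_HOUR = 40       # generator consumption quoted by PEER1
PAIL_GALLONS = 5            # size of each pail carried up the stairs
POUNDS_PER_GALLON = 7.15    # approximate weight of diesel used in the post
HOURS = 48                  # "a couple of days"
POUNDS_PER_US_TON = 2000    # US (short) ton


def fuel_haul(gallons_per_hour, pail_gallons, pounds_per_gallon, hours):
    """Return (pails/hour, pounds/hour, total pounds, total US tons)."""
    pails_per_hour = gallons_per_hour / pail_gallons
    pounds_per_hour = gallons_per_hour * pounds_per_gallon
    total_pounds = pounds_per_hour * hours
    total_tons = total_pounds / POUNDS_PER_US_TON
    return pails_per_hour, pounds_per_hour, total_pounds, total_tons


pails, lb_per_hour, total_lb, total_tons = fuel_haul(
    GALLONS_PER_HOUR, PAIL_GALLONS, POUNDS_PER_GALLON, HOURS
)
print(f"{pails:.0f} pails/hour, {lb_per_hour:.0f} lb/hour")
print(f"{total_lb:,.0f} lb over {HOURS} hours, about {total_tons:.2f} US tons")
```

Change `HOURS` to see how quickly the load grows if the outage had run longer.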

The following pictures are used with permission of Squarespace:

Thirsty generator, on the roof above 17 floors

Basement level, where the fuel is supposed to be stored

Diesel fuel on street waiting for a lift

Part of the bucket brigade

Now, many would argue Squarespace would be better off with a second data center somewhere with automated failover.  That would carry an increased cost, something this author wouldn’t be willing to pay for.  Disaster recovery requirements must be weighed against their costs.  It’s almost laughable to consider a Recovery Time Objective or Recovery Point Objective for these blogs.  If they are down for days, frankly it wouldn’t matter.  These blogs are not time sensitive; the closest thing to a financial impact would be on the Mark Fidrych Foundation, which can accept donations (I encourage you to use it!).
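The cost-versus-impact reasoning above can be written down as a rough back-of-the-envelope check. This is only a sketch: the function name `failover_worth_it` is hypothetical, and every number in the example call is a made-up placeholder, not a real Squarespace price or a real figure for these blogs.

```python
def failover_worth_it(annual_failover_cost, outage_hours_per_year, loss_per_outage_hour):
    """Crude test: is the expected yearly outage loss larger than the
    yearly cost of paying for a second site with automated failover?"""
    expected_annual_loss = outage_hours_per_year * loss_per_outage_hour
    return expected_annual_loss > annual_failover_cost


# Placeholder numbers for illustration only (assumptions, not data):
# a hobby blog that loses almost nothing per hour of downtime.
print(failover_worth_it(
    annual_failover_cost=1200.0,   # hypothetical extra hosting cost per year
    outage_hours_per_year=48.0,    # hypothetical: one two-day outage a year
    loss_per_outage_hour=0.50,     # hypothetical lost donations per hour
))
```

With placeholder values like these the expected loss is far below the cost, which is the point being made about these blogs; a time-sensitive site would plug in very different numbers.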

So due to the heroic efforts of the unsung IT Operations and associated people, PEER1 stayed up, and you are able to enjoy reading this post.

My hat goes off to these people who persevered, with determination and grit, to keep the site going.  In a word, amazing.  I find IT organizations do this regularly to keep the ship afloat, often without complaint.

Will we continue hosting on Squarespace?  You betcha.

What stories do you have of heroic IT efforts?

One midnight shot of a total bucket brigade

Article originally appeared on Gary L Kelley (http://garylkelley.com/).
See website for complete article licensing information.