Internap evacuates LGA datacenter (pastebin.com)
71 points by brokentone on Oct 30, 2012 | 37 comments


Stack Overflow and Stack Exchange are usually hosted in that building at Peer 1, but we failed over to our secondary DC out in Oregon earlier in the evening.


It's interesting to see how a storm can single-handedly knock out major websites and leave millions without power. I think it shows how fragile this tech world really is, and how little we should rely on it in a major disaster.


Don't underestimate "a storm". A hurricane releases about 1.3 x 10^17 joules per day [1] of raw kinetic energy (wind alone), and that's not counting rain and flooding. That's equivalent to about 31 megatons of TNT per day. The most powerful nuclear weapon the US ever fielded, the B41, was about 25 megatons.

[1] - http://www.aoml.noaa.gov/hrd/tcfaq/D7.html
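
For a rough sanity check on that conversion, here is a minimal sketch; the 1.3 x 10^17 J/day figure comes from the NOAA page in [1], while the ~4.184 x 10^15 J per megaton of TNT factor is the standard conversion and is assumed here:

    # Sanity check: convert the NOAA kinetic-energy figure to megatons of TNT.
    KINETIC_ENERGY_J_PER_DAY = 1.3e17   # hurricane wind kinetic energy, per [1]
    J_PER_MEGATON_TNT = 4.184e15        # standard TNT-equivalence factor (assumed)

    megatons_per_day = KINETIC_ENERGY_J_PER_DAY / J_PER_MEGATON_TNT
    print(f"about {megatons_per_day:.0f} megatons of TNT per day")  # -> about 31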


At least with a storm it's somewhat predictable, so one can take action before it hits the datacenter. Nobody was able to predict the outage at Amazon's datacenter, which took out large sites like Airbnb and Reddit almost instantly.


Here's the part I don't understand. This was a record 13-foot storm surge. Is it true that, had the pumps been located above the storm surge, service would have continued uninterrupted? It sounds like everything else in their data center is just fine.

This makes it seem like there was a serious design flaw. These pumps were a critical part of the system, but were located in a vulnerable area.


I'm guessing a complete evacuation of the building would have been required regardless, which makes things a little trickier.

Would they be able, or would it even be advisable, to leave the fuel pumps and generators running and all operations untouched without any staff on site?


I really don't know what the proper procedure should be in a data center when the big klaxon goes off and you hear those fateful words over the intercom: "Close all doors, DIVE, DIVE, DIVE!"


If it is a halon-equipped datacenter, you listen. And then do as it says. Quickly.


So true. My first experience with halon was a briefing at IBM about their machine room policies (I was a student intern). They stressed that when the fire alarm went off, even if you knew there wasn't a fire, you evacuated the room, because once the halon dumped, it was game over unless you had your own source of oxygen.


F*$cked

Ironic that they tweeted this re AWS 12h ago: Internap @Internap: Could 'Frankenstorm' Lead To Another AWS #Outage? http://onforb.es/Vtr45J


Did you by any chance save that? They deleted it…


Clouds are bad for cloud computing.


Virtual machines can migrate at lightspeed?


Update at 10:17 ET: http://pastebin.com/NUQNHHJi Interesting that they don't have a public status page of any sort.


Lifehacker/Gizmodo and their affiliate sites are down too.



We (Shapeways) are also down.


Can they not bypass the destroyed pumps and take fuel to the generators via 55-gallon drums? If not, why not?


"The building itself is being evacuated and no remote hands support will be available to assist in any equipment shutdown. Life safety is our number one priority and we are making plans to completely exit the facility."

Uptime that requires people to risk their lives isn't worth asking for.


At the point that a customer notification includes the words "Life safety is our number one priority" things like "can't they bypass the pumps" become far less significant.

People's lives are at stake. If the service is important enough that you'd expect them to stay behind and keep it running for you, it is probably important enough that it exists in multiple, geographically separated data centres.


The building is being evacuated.


I wonder what the chances are that the water will go down, and they will be able to supply fuel to the pumps, before power is actually lost.


Zero. Their location got flooded by a 12-foot storm surge. I doubt the police are letting anyone near that area. I'm up on 96th and Park. We've had flooding and power outages along the river, inland to 2nd Ave, up here as well. Also, there is practically no way in or out of the island of Manhattan now. All bridges and tunnels are closed except for _possibly_ the Lincoln Tunnel.


So why didn't they do it earlier? That was my point. Of course you have to keep your people safe, but this shows poor planning.


How long would some number of 55-gallon drums of fuel have helped, anyway? A car can run through 55 gallons of fuel in a matter of hours (<10), never mind a datacenter!

There may well have been poor planning at some point, but these things happen so infrequently that there must be an allowance for a this-is-a-disaster situation that cannot easily be worked around.

Another question is for the customers: are they running all their services out of a single datacenter? That sounds like poor planning too.


It is an easy question to answer. For example, http://www.pmsi-inc.com/pdf/GeneracPowerSystem.pdf shows a 1 MW generator uses about 63 gallons per hour, just a little more than one drum.

I don't know the size of the facility, so I can't tell how many drums it would take, per hour, to keep the place running.
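
To make that concrete, here is a minimal back-of-the-envelope sketch; the 63 gallons/hour figure is from the Generac spec linked above, while the example facility loads are purely hypothetical since the actual size of this facility is unknown:

    # How many 55-gallon drums per hour a facility of a given (assumed) load would need,
    # scaling the ~63 gal/hour figure for the 1 MW Generac genset linked above.
    GAL_PER_HOUR_PER_MW = 63
    DRUM_GALLONS = 55

    def drums_per_hour(load_mw):
        return load_mw * GAL_PER_HOUR_PER_MW / DRUM_GALLONS

    for load_mw in (1, 2, 5):  # hypothetical example loads
        print(f"{load_mw} MW -> {drums_per_hour(load_mw):.1f} drums/hour")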


So, now they're expected to move around an unknown quantity of 400-500 lb 55-gallon drums of fuel? And where would they have sourced these so easily and quickly (another unknown, I would think)?

We don't know how much energy is being used nor how much fuel is required to run the generator(s) per hour, among other things. That's a lot of important missing information for calling this situation 'poor planning.'

But I'll respond to your single data point with mine: according to [1], that's 2 full drums for every hour of generator running.

My point isn't to be simply argumentative but to look at things from a more appropriate perspective. Generators in these circumstances (and in my professional experience) are not meant for very long periods of use, let alone indefinite usage.

I'm filing this proposed drum-filling plan into the 'unrealistic' category.

1) http://datacentersmadesimple.com/tech_highlights.html
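
Extending the 2-drums-per-hour figure from [1] to a multi-day outage gives a feel for the logistics. A rough sketch, where the 48-hour outage length and the ~430 lb weight of a full drum are assumptions, not figures from the thread:

    # Rough logistics: total drums and tonnage for a hypothetical 48-hour outage,
    # at the ~2 drums/hour rate cited from [1].
    DRUMS_PER_HOUR = 2
    OUTAGE_HOURS = 48          # assumed outage length
    LB_PER_FULL_DRUM = 430     # assumed: ~390 lb of diesel plus the steel drum

    total_drums = DRUMS_PER_HOUR * OUTAGE_HOURS
    total_tons = total_drums * LB_PER_FULL_DRUM / 2000
    print(f"{total_drums} drums, roughly {total_tons:.0f} tons of fuel to move by hand")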


Try to see it from a different perspective: customers are paying (my guess) $800 per rack, per month, plus power and bandwidth charges; and a rack takes up 30 square feet once you include the space around that rack.

So almost $30 per square foot per month for the raw space alone. $360 per square foot per year is high rent, even for NYC.
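
A quick check of that arithmetic (both inputs are the guesses from the paragraph above, not published Internap pricing):

    # Per-square-foot cost implied by the guessed rack price and footprint above.
    RACK_PRICE_PER_MONTH = 800   # dollars, guessed
    SQ_FT_PER_RACK = 30          # includes the aisle space around the rack

    per_sqft_month = RACK_PRICE_PER_MONTH / SQ_FT_PER_RACK
    print(f"${per_sqft_month:.2f}/sq ft/month, about ${per_sqft_month * 12:.0f}/sq ft/year")
    # ~$26.67/month and ~$320/year; rounding up to $30/month gives the ~$360/year figure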

And what is the client supposed to get for his money? Reliability! The engineering and facilities management expertise to ensure this is baked into the costs.

You ask, "so now they are expected to move 55-gallon drums of fuel"? ABSOLUTELY they are expected to do that. The only "appropriate perspective" is that the clients are paying a lot of money for the datacenter to do whatever needs to be done.

They had a week of warning to source these; they already have a long-standing relationship with their fuel supplier for diesel delivery, so they call him up and say "Joe, we need 20 drums of diesel in addition to topping up the tanks we have" and they arrive in the next 2 days.

These diesel generators are basically modified/tuned versions of a big truck or marine diesel, which has a rebuild interval of 500K to 1 million miles when used as a truck engine, or some high number of operating hours (like 10,000). Perhaps you are thinking of LPG, natgas or gasoline-powered gensets, which are designed for less frequent use.

I researched all aspects of building a DC years ago and realized that even if I could raise the $5 million to do an entry level one, my effort was best spent elsewhere.

Customers punish downtime; this DC will lose clients, be sure of it.

Aside: there was a guy in New Orleans who kept his DC running all through Hurricane Katrina and after it - if you search the site at http://mgno.com with terms like "diesel drums" you will find his old posts. Can't seem to easily link to these old posts, though.


I'm well aware of these generators; there are two shipping-container-sized units right outside my building, tested frequently. So I'll concede that point but still disagree.

I agree they will likely lose some customers, but I disagree that there was much they could do at this point. Were they in the mandatory evacuation zone? (I don't know.) Will 20 drums (10-ish hours) really help if this is a multi-day outage? Did the customers plan for failover to another datacenter, or put all their eggs in one basket? (Oops!)


Storm surge was higher than expected.


Why would any sane manager take the risk of:

- having a bunch of employees fill up 55-gallon drums with easily ignited fuel,

- hustling those barrels over to somewhere else in the datacenter,

- hooking them up to a generator and restarting it,

- all under an evacuation warning issued before any of this happened?


Only Rackspace has fanatical support. They would lay down their lives for one more minute of uptime.



So they have more sense than DirectNIC had during Katrina: http://www.wired.com/science/discoveries/news/2005/09/68725


Internap LGA11 lost power at 11:48 AM ET: http://pastebin.com/6AxvbzF1


Some websites affected by the catastrophe: OccupyWallSt.org, Alternet.org


Internap's cloud service taken down by Sandy's cloud service.



