Discussion Forums



Thread: Unresponsive EC2 instance

This question is answered. Helpful answers available: 1. Correct answers available: 1.



Replies: 36 - Pages: 3 - Last Post: Jun 11, 2009 6:29 PM by: JoeJ@AWS
msyst

Posts: 6
Registered: 9/18/08
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 7:44 PM PDT   in response to: onesunnyguy
 

Seems to be affecting at least two zones, us-east-1a, and us-east-1b.  All three of my instances that are not responsive are in us-east-1a.

i-7abf0f13
i-58600b31
i-74d6661d


rtdev

Posts: 147
Registered: 3/28/08
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 7:45 PM PDT   in response to: filife
 

Just a guess, but if I recall correctly, one availability zone for EC2 customer 1 can be a different availability zone for customer 2.  So my us-east-1b can be the same as your us-east-1c.  I think they are distinct within the same account, meaning that if I create an instance in 1b and another in 1c they will be in different zones from each other, but for other accounts 1b maps to something else.


Ramkumar@AWS

Posts: 515
Registered: 9/20/07
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 7:46 PM PDT   in response to: pjanakiraman
 

We have posted to the service health dashboard answering many of these questions. Please refer there for more information. We are working hard to restore the affected instances that are all in a single AZ.


Eric Hammond
RealName(TM)


Posts: 1,134
Registered: 7/7/07
Re: Availability zones per account
Posted: Jun 10, 2009 7:47 PM PDT   in response to: filife
 

The availability zone "us-east-1a" for one EC2 account could be the same availability zone as "us-east-1b" in another EC2 account and "us-east-1c" in a third account.  Amazon spreads these out so that if everybody runs primarily in their own "us-east-1a" it doesn't overload a single availability zone.
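
As a rough illustration of the idea (not Amazon's actual mechanism, which isn't described in this thread), here is a minimal Python sketch in which each account gets its own deterministic shuffle of zone labels onto physical zones. The names PHYSICAL_ZONES, LABELS, zone_map, and the account IDs are all hypothetical.

```python
import hashlib
import random

# Hypothetical physical zones and the labels customers see.
PHYSICAL_ZONES = ["zone-1", "zone-2", "zone-3"]
LABELS = ["us-east-1a", "us-east-1b", "us-east-1c"]


def zone_map(account_id: str) -> dict:
    """Return a per-account mapping from zone label to physical zone.

    Assumption: the shuffle is seeded from a hash of the account ID, so the
    mapping is stable for one account but differs between accounts.
    """
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    shuffled = list(PHYSICAL_ZONES)
    random.Random(seed).shuffle(shuffled)
    return dict(zip(LABELS, shuffled))


if __name__ == "__main__":
    for account in ("account-A", "account-B", "account-C"):
        print(account, zone_map(account))
```

Within one account every label still points at a distinct physical zone, so 1b and 1c never collide; only the label-to-zone assignment varies from account to account.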



rtdev

Posts: 147
Registered: 3/28/08
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 7:51 PM PDT   in response to: pjanakiraman
 

This was just posted on the health dashboard as of 9:33 PDT. I have a follow-up question after the quote below:
----
We wanted to give you a quick update. A lightning storm caused damage to a single Power Distribution Unit (PDU) in a single Availability Zone. While most instances were unaffected, a set of racks does not currently have power, so the instances on those racks are down. We have technicians on site, and we are working to replace the affected PDU. We do not yet have an ETA, but we expect to be able to recover the instances when we restore power. Besides these affected instances, all other instances, and all other Availability Zones, are operating normally. Users with affected instances can launch replacement instances in any of the US Region Availability Zones or wait until their instance(s) are restored.
----

Am I reading this correctly that they anticipate that once they restore power, the instances should be back online without data loss?

That would be great, but what I don't understand is this - if the instances are without power, then how can we expect they will be restored without loss?  I have data on /mnt on one of these boxes I need to recover.  Should I start planning now for that to be lost or might it still be there when they restore connectivity??


filife

Posts: 71
Registered: 7/17/08
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 7:54 PM PDT   in response to: pjanakiraman
 

Ramkumar,

Out of curiosity, why would damage to a single PDU cause a rack to lose power? Proper data center racks have AB power, having two distinct power feeds from separate PDUs.  Why would a failure of one PDU cause a whole rack to fail?
The only scenario I can think of is that the rack is overloaded, so when one PDU fails the other PDU can't handle the amperage draw and fails as well.

I hope that's not the case, but without more information I'm only left to draw conclusions.


msyst

Posts: 6
Registered: 9/18/08
Re: Availability zones per account
Posted: Jun 10, 2009 7:55 PM PDT   in response to: Eric Hammond
 

------------
The availability zone "us-east-1a" for one EC2 account could be the same availability zone as "us-east-1b" in another EC2 account and "us-east-1c" in a third account.  Amazon spreads these out so that if everybody runs primarily in their own "us-east-1a" it doesn't overload a single availability zone.
------------

Wasn't aware of that, but definitely makes sense as I'm sure a lot of people default to using 1a.  Learn something new every day. :)
Message was edited by: msyst

yujb

Posts: 35
Registered: 3/24/09
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 8:02 PM PDT   in response to: Ramkumar@AWS
 

btw the "Report an Issue" link on http://status.aws.amazon.com/ points to a non-existent page.


barney1982

Posts: 13
Registered: 6/10/09
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 8:10 PM PDT   in response to: rtdev
 

rtdev wrote:
Am I reading this correctly that they anticipate that once they restore power, the instances should be back online without data loss?

That would be great, but what I don't understand is this - if the instances are without power, then how can we expect they will be restored without loss?  I have data on /mnt on one of these boxes I need to recover.  Should I start planning now for that to be lost or might it still be there when they restore connectivity??


Same concern here. I don't understand how the instance can be restored without data loss. If the data can be restored, then I'm definitely going to wait for it. If not, please let us know so that we can start working on a new instance.

C. Oliver
RealName(TM)


Posts: 36
Registered: 4/22/07
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 8:14 PM PDT   in response to: pjanakiraman
 

I got hit by this too. I also don't think Amazon is telling the truth that it was just one rack.



ivangrosny

Posts: 8
Registered: 6/10/09
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 8:20 PM PDT   in response to: C. Oliver
 

I got this too
i-d92664b0.

any ideas?


rtdev

Posts: 147
Registered: 3/28/08
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 8:22 PM PDT   in response to: barney1982
 

I am hoping it is the equipment that handles the connectivity that lost power, and that the instances are sitting behind it waiting for connectivity to be restored.  From the terse description on the dashboard I can't tell if that's the case. However, if the instances themselves lost power, then although they would be relaunched, any data on /mnt would be gone, I suppose.


C. Oliver
RealName(TM)


Posts: 36
Registered: 4/22/07
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 8:26 PM PDT   in response to: rtdev
 

I'm going to guess we lost data... Seems odd to just let the PDU control the network.

filife

Posts: 71
Registered: 7/17/08
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 8:36 PM PDT   in response to: pjanakiraman
 

It's been over an hour since the last update.  Can we get an idea of where Amazon is with the issue?

Do I need to spawn up more instances?

Thanks

Chris


msyst

Posts: 6
Registered: 9/18/08
Re: Unresponsive EC2 instance
Posted: Jun 10, 2009 8:55 PM PDT   in response to: filife
 

8:43 PM PDT  We are in the process of restoring power to the affected instances. We expect recovery to start within half an hour. We also identified and corrected a related issue that caused "Describe-*" calls to return non-current data.


