Thread: Massive (500) Internal Server Error.outage started 35 minutes ago
|
|
Posts:
21
Registered:
10/4/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:19 AM PST
in response to: A. Barbieri
|
|
|
Echoing the previous comments:
- outage started around 7:25 EST
- 404 when accessing login page for forum
- initially I could not connect to EC2 or S3, at all (100% failure)
- EC2 started working again just before 9:00 EST
- S3 is still 100% failure
An ETA would be helpful, but it's understandable that it could take a little bit to determine the problem first before providing one. Please just keep us posted once you have new info.
Also, emergency contact information needs to be provided. If these forums are our only way of contacting support, and they stop working due to a failure (like this morning), how can we notify you? Errors happen, but there MUST be a fail-safe way of reporting them.
|
|
Posts:
5
Registered:
11/5/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:20 AM PST
in response to: maxcc
|
|
|
Dears,
I'm really afraid to use the S3 service now. My company chose to work with Amazon because of its reliability. We host more than 30,000 images from the number 2 TV station in Brazil. Now we are having several problems because of these S3 issues.
We should send this error to blogs and newspapers. Everyone should know what's going on with Amazon's Web Services.
Hope we have news soon!
Gustavo
|
|
Posts:
25
Registered:
1/9/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:21 AM PST
in response to: D. Snyder
|
|
|
The whole point of S3 is that it is distributed, redundant, and backed by an SLA. Yes it would be a good idea to have a backup solution, but that's like saying I need to back up my data with three different storage providers. With uptime it's all about figuring out (a) what the expected uptime of a service is and (b) what redundancy you have to have to make your uptime as high as you need it given that.
That being said, I also have had only minimal issues with S3 until now - which is why we made that "expected uptime" so high. We'll have to re-evaluate that.
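To make the (a)/(b) reasoning above concrete, here is a rough sketch in Python. It assumes the services fail independently, which a shared dependency (such as a common authentication system) would break, and the availability figures used below are made-up illustrations, not numbers from any SLA.

```python
def combined_availability(availabilities):
    """Probability that at least one of several independent services is up.

    Assumption: failures are independent. Correlated outages make the
    real figure lower than this estimate.
    """
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down


def expected_downtime_hours(availability, hours_per_year=365 * 24):
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * hours_per_year


if __name__ == "__main__":
    primary = 0.999   # illustrative "expected uptime" for the main store
    backup = 0.99     # illustrative figure for a cheaper second store
    both = combined_availability([primary, backup])
    print(f"primary alone: {expected_downtime_hours(primary):.2f} h/year down")
    print(f"with backup:   {expected_downtime_hours(both):.2f} h/year down")
```

The point of the exercise is that the answer to "do I need a second provider" depends entirely on the expected uptime you plug in for the first one, which is the number an outage like this forces you to revisit.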
|
|
Posts:
10
Registered:
1/15/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:21 AM PST
in response to: beanie4242
|
|
|
The errors are alternating between:
Http/1.1 Service Unavailable
and a full
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><RequestId>4DA9AB18343E90DA</RequestId><HostId>mik6GeTv+NChd4rFfpawsaaASfTZOdXN2FlKw5kcoJk61NnIKgpG+RaJwqUBDrUh</HostId></Error>
Don't know if that helps, but yes, this issue is really filling up our support queue with emails and complaints
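For clients that have to live with alternating 503 and InternalError responses like the ones above, one common mitigation is to retry with exponential backoff and log the RequestId for support. The following is only a sketch: `fetch` is a hypothetical callable standing in for whatever HTTP client you use, and the attempt count and delays are arbitrary assumptions.

```python
import random
import time
import xml.etree.ElementTree as ET

RETRYABLE_STATUSES = {500, 503}


def parse_s3_error(body):
    """Return (code, request_id) from an <Error> document, or (None, None)."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Body was not XML, e.g. a bare "Service Unavailable" line.
        return None, None
    return root.findtext("Code"), root.findtext("RequestId")


def get_with_backoff(fetch, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it returns a non-retryable status or attempts run out.

    fetch is a hypothetical callable returning (status_code, body_text).
    Returns the last (status_code, body_text) seen.
    """
    for attempt in range(max_attempts):
        status, body = fetch()
        if status not in RETRYABLE_STATUSES:
            return status, body
        code, request_id = parse_s3_error(body)
        print(f"attempt {attempt + 1}: HTTP {status} code={code} request_id={request_id}")
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter to avoid synchronized retries.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))
    return status, body
```

Retries like this only help with intermittent failures. During a sustained outage they simply give up after `max_attempts`, so the caller still needs a fallback.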
|
|
Posts:
5
Registered:
1/11/08
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:24 AM PST
in response to: maxcc
|
|
|
Hi,
we are experiencing the same issue ...
for example, trying to access
the key: 1936.jpg
in the bucket: buddies.koinup.com
we get this error:
<Error>
<Code>InternalError</Code>
<Message> We encountered an internal error. Please try again.
</Message>
<RequestId>41786F29ED914C93</RequestId>
<HostId> vWM+6yPsvhu+7ppNGiDqVNGmm2v8mxg6yoIh7rIyJ21VjWMrka9rS8xMi76a6KDj
</HostId>
</Error>
the same happens for every key in the bucket.....
storage.koinup.com
buddies.koinup.com
a few minutes ago the previous error changed to:
Http/1.1 Service Unavailable
postscript: for about 45 minutes I wasn't able to log in to the forum,
so that was another delay in communicating
the issue has now lasted more than an hour
BTW, we have faced other issues in the last few weeks..... as other users are saying,
if you use Amazon, you should also have another service for backup
this is a problem....
hope you fix this issue quickly
Message was edited by: techup
|
|
Posts:
10
Registered:
5/1/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:25 AM PST
in response to: maxcc
|
|
|
While I'm surprised this kind of error is possible, a big thanks to Amazon for getting onto this so quickly.
I've had 100% success rate on GET requests for about 20 mins (although all other requests still seem to be failing).
|
|
Posts:
3
Registered:
12/10/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:26 AM PST
in response to: maxcc
|
|
|
While S3 Europe is happily trotting along, indeed S3 US is completely down.
Which also means our site is completely down.
The second major outage in about a month, although last time it was a DNS issue right outside the S3 infrastructure.
I won't go as far as to say I'll stop using S3 as it's proven very reliable in the last 4 months and has allowed us to handle peak volume a lot better than in the past...
Nevertheless, it's clear that emergency scenarios need to be investigated.
How about AWS developing "replicating buckets" between EU and US?
A bit of load-balancing + error-correcting DNS on top and we've got a world class solution... and honestly, I wouldn't mind at all paying for the bandwidth usage between EU and US to replicate.
Heck if AWS adds loadbalancing DNS to S3, I'd be happy to do my own replication.
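Until something like that exists on the service side, the "do my own replication" idea can be approximated in the client: write each object to a bucket in each region, then read from whichever copy answers. A minimal sketch of the read path follows. The bucket names and the `fetch` callable are hypothetical, and this does client-side failover only, not the load-balancing DNS described above.

```python
def fetch_with_failover(key, replicas, fetch):
    """Try each replica in order; return (bucket, body) from the first success.

    replicas: list of hypothetical bucket names that hold the same keys.
    fetch: hypothetical callable (bucket, key) -> (status_code, body).
    Raises RuntimeError if no replica returns HTTP 200.
    """
    errors = []
    for bucket in replicas:
        try:
            status, body = fetch(bucket, key)
        except OSError as exc:
            # Connection-level failure: record it and move to the next replica.
            errors.append((bucket, str(exc)))
            continue
        if status == 200:
            return bucket, body
        errors.append((bucket, f"HTTP {status}"))
    raise RuntimeError(f"all replicas failed for {key!r}: {errors}")
```

For example, `fetch_with_failover("1936.jpg", ["example-us", "example-eu"], fetch)` would fall through to the EU copy whenever the US bucket returns a 500 or 503. Keeping the copies in sync on write is the harder half and is not shown here.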
|
|
Posts:
1
Registered:
2/15/08
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:26 AM PST
in response to: maxcc
|
|
|
Same issue here; we are now in the 3rd hour completely without service.
Sadly, this is not the first time we have seen this kind of failure, although past downtimes were less than 10 minutes.
|
|
Posts:
29
Registered:
1/10/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:28 AM PST
in response to: A. Barbieri
|
|
|
On a positive note, 30min ago I could not even post to this thread so it looks like work is being done. Hope it does not take too much longer.
|
|
Posts:
63
Registered:
7/7/06
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:32 AM PST
in response to: thatsmymouse
|
|
|
lucky you...
3 hours into the disruption and the 'ERROR 500: Internal Server Error.' has now become 'ERROR 503: Service Unavailable.'
|
|
Posts:
110
Registered:
4/24/06
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:32 AM PST
in response to: maxcc
|
|
|
Although it would be nice to get an update from Amazon on the issue, it does appear that some progress is being made. As a few others have mentioned you can now log into this forum, which wasn't working before. Also, the FPS service (which was also affected) has now started accepting web service requests again.
I'm guessing that whatever the issue is, it's tied to authentication which explains why it's affecting all the AWS services. Hopefully this will spur Amazon to add additional redundancy to the authentication system which appears to be a massive single point of failure right now.
|
|
Posts:
6
Registered:
1/16/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:36 AM PST
in response to: Ian Connor
|
|
|
We're using S3 as our only storage for 250,000 images and growing... and we're not the only ones (for example, 37signals' Basecamp).
It's very hard to decide if a second backup is necessary. I agree that the whole point of S3 is that it is distributed, redundant, and backed by an SLA. Why would we then need a fail-safe?
Hope all is solved quickly.
|
|
Posts:
5
Registered:
11/5/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:40 AM PST
in response to: maxcc
|
|
|
Any news from Amazon?? I need to tell my clients something.
Message was edited by: gcaetano
|
|
Posts:
2
Registered:
2/15/08
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:40 AM PST
in response to: A. Barbieri
|
|
|
Why doesn't Amazon move all the traffic over to the EU network, then?
That one is working, as far as I can see....
|
|
Posts:
31
Registered:
1/13/07
|
|
|
|
Re: Massive (500) Internal Server Error.outage started 35 minutes ago
Posted:
Feb 15, 2008 6:48 AM PST
in response to: artionet
|
|
|
It would be good to have an update on progress so we can pass this on to our clients.
|
|
|
|