[Resolved] Cypress Unexpected Downtime


5 replies to this topic

#1 MikeDVB

MikeDVB

    Forum Administrator

  • Staff Administrator
  • PipPipPipPipPip
  • 2,901 posts
  • Gender:Male
  • Location:Central Indiana, USA

Posted 11 March 2011 - 03:16 PM

We're aware that the Cypress server is offline and we're working on it. We'll post more information once we have it; however, the priority right now is restoring full service.
Michael Denney - MDDHosting LLC - Providing Hosting since 2007
Scalable shared hosting plans in the cloud! Check them out!
Highly Available Cloud Shared, Reseller, and VPS
http://www.mddhosting.com/

#2 MikeDVB

Posted 11 March 2011 - 04:00 PM

The server was rebooted within seconds of the alerts from our internal monitoring and took about 8 minutes to get back up to speed. We're still investigating the cause of the outage and will update you.

#3 Brian Stevenson

Brian Stevenson

    Newbie

  • Members
  • Pip
  • 17 posts

Posted 11 March 2011 - 04:09 PM

The network graph is particularly interesting:
http://www.mddhostin...erverstatus.php
Posted Image

#4 MikeDVB

Posted 11 March 2011 - 04:16 PM

That graph isn't entirely accurate at all times. It's only there for those who are curious and shouldn't be used to diagnose issues :) For example, see this graph of our network over the same period of time:
Posted Image
We're working on making these nicer, more accurate graphs available on the public side of things, but I can't promise when or if that will happen.

As far as the crash goes - the server went from having around 8 GB of RAM free (which is a lot, more than many providers have in total in their servers) to 0, and the server started killing processes to get some free RAM back. This happened so quickly that logging stopped before anything useful could be written to disk to diagnose it, and we were forced to perform a reboot.

We're setting up some additional internal monitoring on a very fast interval (something like 5 seconds) for the next 48 hours so if it happens again we will have some useful information to diagnose what happened.
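
For readers who want to try something similar, here is a minimal sketch of fast-interval memory logging. It is not the monitoring MDDHosting actually set up. It assumes a Linux box where /proc/meminfo is readable, and the function names, the log file name memwatch.log, and the log line format are all made up for illustration. Each sample is flushed and fsync'd so the last readings have a better chance of reaching the disk if the server locks up.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def format_sample(meminfo, now):
    """Build one log line (hypothetical format) from parsed meminfo values."""
    free_mb = meminfo.get("MemFree", 0) // 1024
    swap_free_mb = meminfo.get("SwapFree", 0) // 1024
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return f"{stamp} mem_free_mb={free_mb} swap_free_mb={swap_free_mb}"


def monitor(log_path="memwatch.log", interval=5.0, samples=None):
    """Append one sample every `interval` seconds; run forever if samples is None."""
    taken = 0
    with open(log_path, "a") as log:
        while samples is None or taken < samples:
            with open("/proc/meminfo") as f:
                info = parse_meminfo(f.read())
            log.write(format_sample(info, time.time()) + "\n")
            # Push the line out of Python's buffer and ask the OS to write it
            # to disk now, rather than whenever the page cache gets flushed.
            log.flush()
            os.fsync(log.fileno())
            taken += 1
            if samples is None or taken < samples:
                time.sleep(interval)
```

Calling monitor() with no arguments samples every 5 seconds until interrupted; monitor(samples=3, interval=1.0) is a quick way to check that it writes lines before leaving it running. A real setup would probably also record the largest processes by memory use, which this sketch leaves out.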

At this point the server is back online and we're going to mark this as closed. If you have any questions at all, do feel free to ask them.

#5 Brian Stevenson

Posted 11 March 2011 - 04:46 PM

Hmm.. maybe cacti.mddhosting.com and cypress.mddhosting.com are the same box? If logging stopped, that would explain why the graph shows some dead air during the outage.

Thanks for digging into this one, Michael.

#6 MikeDVB

Posted 11 March 2011 - 04:48 PM

No, they're not on the same server - Cacti just doesn't seem to be very reliable as we have it configured. Both Cacti and the graph I included poll the switch for traffic details, so the information source is the same.

We don't use it ourselves; it's just there to show a nice graph to customers who want to see the traffic out of curiosity. We may pull the graph off of that page until we can get a more reliable solution in place.



