[Completed] Fresh Backups of Echo / Cypress / Fresco


33 replies to this topic

#1 MikeDVB

MikeDVB

    Forum Administrator

  • Staff Administrator
  • PipPipPipPipPip
  • 2,900 posts
  • Gender:Male
  • Location:Central Indiana, USA

Posted 02 April 2011 - 11:32 AM

Keeping backups of the servers and your data is, in our opinion, extremely important for protecting your data against hardware failure and other forms of data loss and corruption. During scheduled maintenance on the backup system yesterday afternoon we experienced some issues with our backup server, and these will force us to take fresh backups of all servers to protect your data.

This fresh backup will need to copy every single bit of data off of each server to the backup server, which is a very disk-intensive process and will certainly cause some performance hits tonight while the processes run. We considered putting this off for a week or two because of the intermittent issues we've had over the last couple of weeks; however, it's a risk that we simply do not wish to take.

The process on each server will begin at these times:
Cypress Server - 10:30 PM EST (GMT-5) (Estimated 8 hour run-time)
Echo Server - 11:30 PM EST (GMT-5) (Estimated 6 hour run-time)
Fresco Server - 12:30 AM EST (GMT-5) (Estimated 6 hour run-time)
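
For anyone planning around the window, the estimated finish times follow from the start times and run-times listed above. Here is a minimal Python sketch of that arithmetic. The calendar dates are an assumption (the post is dated 02 April 2011, so the 12:30 AM start is taken to fall on 03 April), and the names `SCHEDULE` and `finish_times` are made up for illustration. Times are in the same zone as the schedule.

```python
from datetime import datetime, timedelta

# Start times and estimated run-times copied from the schedule above.
# The calendar dates are an assumption based on the post date (02 April 2011).
SCHEDULE = [
    ("Cypress", datetime(2011, 4, 2, 22, 30), 8),
    ("Echo", datetime(2011, 4, 2, 23, 30), 6),
    ("Fresco", datetime(2011, 4, 3, 0, 30), 6),
]


def finish_times(schedule):
    """Return a dict mapping each server name to its estimated finish time."""
    return {
        name: start + timedelta(hours=hours)
        for name, start, hours in schedule
    }


if __name__ == "__main__":
    for name, finish in finish_times(SCHEDULE).items():
        print(f"{name}: estimated finish {finish:%d %B %Y %I:%M %p}")
```

If the estimates hold, Cypress and Fresco would finish around 6:30 AM and Echo around 5:30 AM.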

If we do not run this process tonight, the only other viable option would be to wait 7 days until next Sunday, which would leave your data unprotected and not backed up for a solid week during which any hardware failure or other major system issue could cause your data to be lost without recourse.

We understand any frustration this may cause and we apologize. We hope that the backup process impact will be minimal due to the server usage levels being their lowest on Sunday evenings and we expect the process to be finished by the start of normal business hours in the US on Monday morning.

We do realize that not all of our customers are based in the United States and that this backup process may cause issues for some of our clients whose peak times are opposite those in the US. Ultimately, however, we need to perform these backups when overall server usage is at its lowest, and that is during the off-peak time for the US.

If you have any questions, let us know.
Michael Denney - MDDHosting LLC - Providing Hosting since 2007
Scalable shared hosting plans in the cloud! Check them out!
Highly Available Cloud Shared, Reseller, and VPS
http://www.mddhosting.com/

#2 MikeDVB

Posted 02 April 2011 - 12:03 PM

Here is an exact copy of the email message that has been dispatched to all customers.

Over the past couple of weeks we've had some performance issues with our servers mostly during the over-night period due to a couple of factors. I want to detail why these issues have happened, what we're doing to resolve them, as well as going over some scheduled maintenance for this evening. We always do our best to be as transparent and honest as possible and as such we're not going to keep any details from you.

As you may already know, we run CloudLinux on our servers to help keep things stable and to prevent the over-usage of some accounts from affecting everybody else. The system is generally very solid and does a wonderful job of doing what it's supposed to do. Recently, on the advice of CloudLinux, we upgraded to a newer version of the system that was supposed to offer increased performance and reliability. Instead we ran into a couple of serious issues with the new version of the software, and these were exacerbated by our R1Soft backup system, which can be very intensive all on its own.

During normal use the new version of CloudLinux we were running on the servers performed well and did indeed give some performance gains, especially when it came to hard drive access speeds on the server. The issues arose when our R1Soft backup system was run to back up your data and protect you from hardware failure and other causes of data loss. This backup process tends to be very intensive, as it needs to scan every bit on the disk to look for and record any changes made to the data in order to keep an up-to-date backup. When this backup process ran, the servers locked up without any warning, forcing us not only to reboot the systems but also to abort a backup process that had already been running for an hour or more. We have downgraded the CloudLinux installations across our servers to a version we operated for a very long time without issues, and we don't anticipate any major issues with the backup systems from this point forward.
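
To illustrate why scanning for changes is so disk-heavy, and why a fresh "seed" backup has to copy everything, here is a minimal Python sketch of block-level change detection by hash comparison. This is not R1Soft's actual mechanism. The block size and the names `block_hashes` and `changed_blocks` are hypothetical, chosen only to show the general idea.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(previous_hashes, current_data, block_size=BLOCK_SIZE):
    """Return (indexes of new or changed blocks, hashes of current_data).

    With no previous hashes (a fresh seed backup) every block counts as
    changed, so everything has to be read and copied.
    """
    current_hashes = block_hashes(current_data, block_size)
    changed = [
        index
        for index, digest in enumerate(current_hashes)
        if index >= len(previous_hashes) or previous_hashes[index] != digest
    ]
    return changed, current_hashes


if __name__ == "__main__":
    disk = b"a" * 8192 + b"b" * 4096
    seed, hashes = changed_blocks([], disk)
    print("seed backup copies blocks:", seed)

    modified = disk[:4096] + b"c" * 4096 + disk[8192:]
    delta, _ = changed_blocks(hashes, modified)
    print("incremental backup copies blocks:", delta)
```

Even in this toy version, every block has to be read to find out whether it changed, which is where the disk load comes from.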

We have always kept backups of our servers, and we use those backups fairly regularly to fix issues for customers, such as when somebody accidentally deletes a file or drops the wrong database or database table. Because the backup process is very intensive, it does cause a few minutes of extremely slow performance every night when it runs; however, this usually clears up in 2 to 3 minutes. We realize that this can be annoying; however, it's the cost of keeping up-to-date off-server copies of all data and databases.

Due to all of the issues we've faced while performing backups over the last week, we feel that we cannot rely upon the quality of the backups currently stored for the systems. As such, we need to perform fresh backups of all servers to ensure that if the need does arise to restore data from the backups, the restored data is accurate and problem-free. We evaluated when to run these fresh backups so as to cause the least impact on service performance for our customers, and we determined that Sunday nights are the best time for such intensive processes. The next decision was whether to wait a week or two to perform this fresh backup or to go ahead and do it as soon as possible.

We have chosen to perform the seed backups tonight because we feel a week without reliable backups is far too long; hardware failure is not something that can be expected or planned for. We do run redundant disk arrays in our servers to help protect against drive failure (up to 2 drives can fail per server without data loss); however, running redundant arrays is not a substitute for reliable daily backups of the data. We expect this process to take between 6 and 8 hours per server, and once we've performed this fresh backup the backup system should not cause any major performance degradation in the future beyond the expected two to three minutes per night that it takes for the backup system to spin up per server and get started.

You may be frustrated with the issues we've faced on a couple of our servers over the last two weeks, and we're right there with you on that frustration. We have posted a thread on the forum with the information contained in this email as well as some additional details about the backup process for this evening. If you have any direct questions, you are welcome to respond to this email or to visit the forums and publicly post your thoughts / questions / comments / suggestions as well.

Here is a link to the forum thread, for your convenience:
http://forums.mddhos...cypress-fresco/

Thank you,



#3 fshagan

fshagan

    Member

  • Members
  • PipPip
  • 145 posts

Posted 02 April 2011 - 12:16 PM

Great advance notice and detailed explanation! I like the fact that we know what's coming up, and why when possible.

For those of us with a VPS on a different server (i.e. Atlantis), I'm assuming the above problems with CloudLinux and the slight possibility that the backups could be corrupted don't apply?

#4 MikeDVB

Posted 02 April 2011 - 12:19 PM

That is correct, our VPS servers are being backed up by R1Soft nightly and have not experienced any issues at all with backup integrity. We hope that this is the last time we have to do a fresh seed backup for quite some time. Tonight should be low enough usage across the servers that the backup process shouldn't cause any issues but ultimately we wanted to give notice just in case it does cause any performance issues.

#5 forumite

forumite

    Newbie

  • Members
  • Pip
  • 6 posts

Posted 02 April 2011 - 01:12 PM

Another big thanks for MDD being proactive in taking action.

One minor clarification - both messages and the email mentioned Sunday night, but they also say 'tonight', which of course is Saturday in the US. Just need to let my users know which night.

Thanks,

Tom

#6 kittybabylove

kittybabylove

    Newbie

  • Members
  • Pip
  • 1 posts

Posted 02 April 2011 - 02:38 PM

Thanks for keeping us in the loop! All the info and the proactive decision-making are much appreciated.

#7 Akara

Akara

    Newbie

  • Members
  • Pip
  • 1 posts

Posted 02 April 2011 - 02:52 PM

Thanks for letting us know what is going on. My site has been slow the last couple of days and I was wondering what was going on. Will we be able to edit our sites during this time, or is that advised against?

Also, is it today 4/2 or 4/3? It's only Saturday for me hehe

#8 MikeDVB

Posted 02 April 2011 - 04:59 PM

You should be fine to edit your sites, and today is Saturday, April 2nd, 2011.

#9 Da Chef

Da Chef

    Newbie

  • Members
  • Pip
  • 8 posts

Posted 02 April 2011 - 08:48 PM

MDDhosting, always stepping ahead and never behind when it comes to service and proactive technical actions, as to these recent and current issues of server lag, etc.
Thank you MDDhosting for such a wonderful approach to our company and all of your hosted customers as well. It sounds like there are many of us out there that are happy with your services.

#10 MikeDVB

Posted 02 April 2011 - 10:58 PM

The backups are running on Cypress and Echo (and will start on Fresco in about a half hour) and so far so good with no issues whatsoever.

#11 MikeDVB

Posted 02 April 2011 - 11:07 PM

Cypress did just now go unresponsive for a minute; however, that was due to the backup system doing the intensive first backup while a user was gzipping an extremely large file. We've cancelled the gzip temporarily, and the server will take a minute or two to catch up and get back to normal.

We are watching the servers closely to try and make this as smooth as possible.

#12 MikeDVB

Posted 03 April 2011 - 03:27 AM

Cypress has run into some issues with disk I/O during the full backup due to user-generated backups. We've temporarily been suspending those processes while this full backup runs.

Echo has finished and Fresco will be done in about 2 hours. Cypress looks to be taking quite a bit longer due to the large amount of data to be backed up (around 1.2 TB).
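
As a rough sanity check on the figures in this thread, copying about 1.2 TB within the 8-hour estimate given for Cypress in the first post would need a sustained rate of roughly 42 MB/s. Here is a minimal Python sketch of that arithmetic, assuming decimal units (1 TB = 1,000,000 MB). The function name `required_throughput_mb_s` is hypothetical.

```python
def required_throughput_mb_s(total_tb, hours):
    """Sustained MB/s needed to copy total_tb terabytes within the given hours.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    total_mb = total_tb * 1_000_000
    return total_mb / (hours * 3600)


if __name__ == "__main__":
    # 1.2 TB is the Cypress figure from this post; 8 hours is the original estimate.
    rate = required_throughput_mb_s(1.2, 8)
    print(f"Required sustained rate: {rate:.1f} MB/s")
```

Any competing disk activity eats into that rate, which fits with the full backup on Cypress taking longer than estimated.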

#13 MikeDVB

Posted 03 April 2011 - 03:39 AM

As we've been watching Cypress we see that the log processing on the server as well as the daily system crons were previously interfering with the backup process. Tonight we stopped them manually while this process runs to make the server responsive again and will manually process these later today once the backup process is finished.

#14 MikeDVB

Posted 03 April 2011 - 01:30 PM

I just checked and everything finished a while ago so we're going to mark this as resolved :)

#15 ender

ender

    Newbie

  • Members
  • Pip
  • 8 posts

Posted 03 April 2011 - 11:02 PM

I just checked and everything finished a while ago so we're going to mark this as resolved :)

If everything is finished, why is the Fresco server still rebooting and slow?

#16 MikeDVB

Posted 03 April 2011 - 11:04 PM

If everything is finished, why is the Fresco server still rebooting and slow?

The backup process runs every night and does cause some slowness while it gets up to speed. If you're experiencing an issue, open a ticket.

#17 Brad

Brad

    Member

  • Members
  • PipPip
  • 29 posts

Posted 04 April 2011 - 01:24 PM

Just a note, because it might be related seeing as this is the most recent action taken on the server I'm on. We've been extraordinarily slow for about an hour now. Cypress server. 8+ seconds to load a page on my sites.

P.s.
Ticket sent.

#18 Ilan

Ilan

    Newbie

  • Members
  • Pip
  • 10 posts

Posted 04 April 2011 - 01:26 PM

I'm surprised you're able to access your site. For me, my site doesn't even load.
DMCTalk.com - DeLorean forum for owners and enthusiasts

#19 Brian Stevenson

Brian Stevenson

    Newbie

  • Members
  • Pip
  • 17 posts

Posted 04 April 2011 - 01:26 PM

Just a note, because it might be related seeing as this is the most recent action taken on the server I'm on. We've been extraordinarily slow for about an hour now. Cypress server. 8+ seconds to load a page on my sites.

P.s.
Ticket sent.

I can confirm that my site on cypress is experiencing the same issue. CloudFlare cache is kicking in b/c my site is timing out.

#20 Brian Stevenson

Posted 04 April 2011 - 01:29 PM

I'm surprised you're able to access your site. For me, my site doesn't even load.

The current server load doesn't appear bad. Could it be an I/O issue? The current stats:
Server Load 7.13 (16 cpus)
Memory Used 45.9 %
Swap Used 0.61 %
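
To put the load figure in context, dividing the load average by the number of CPUs gives a per-core value of about 0.45 here. Here is a minimal Python sketch of that check. The threshold of 1.0 per core is a common rule of thumb, assumed here and not taken from this thread, and the function names are hypothetical.

```python
def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of CPUs."""
    return load_average / cpu_count


def looks_cpu_saturated(load_average, cpu_count, threshold=1.0):
    """Rule-of-thumb check; the 1.0-per-core threshold is an assumption."""
    return load_per_cpu(load_average, cpu_count) >= threshold


if __name__ == "__main__":
    # Figures from the stats above: load 7.13 on 16 CPUs.
    print(f"Load per CPU: {load_per_cpu(7.13, 16):.2f}")
    print("CPU saturated by rule of thumb:", looks_cpu_saturated(7.13, 16))
```

A per-core load well under 1.0 suggests the CPUs are not the bottleneck, so slow page loads would point at something else, such as disk I/O.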