
[Completed] Fresh Backups of Echo / Cypress / Fresco
#1
Posted 02 April 2011 - 11:32 AM
This fresh backup needs to copy every bit of data from each server to the backup server. That is a very disk-intensive process and will certainly cause some performance hits tonight while it runs. We considered putting this off for a week or two because of the intermittent issues we've had over the last couple of weeks; however, that is a risk we simply do not wish to take.
The process on each server will begin at these times:
Cypress Server - 10:30 PM EST (GMT-5) (Estimated 8 hour run-time)
Echo Server - 11:30 PM EST (GMT-5) (Estimated 6 hour run-time)
Fresco Server - 12:30 AM EST (GMT-5) (Estimated 6 hour run-time)
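For anyone who wants to work out when each run should finish, here is a minimal Python sketch that adds the estimated run-times to the start times listed above. The anchor date and the names `NIGHT_OF`, `SCHEDULE`, and `estimated_windows` are assumptions made for illustration; only the clock times and run-time estimates come from this post.

```python
from datetime import datetime, timedelta

# Placeholder anchor date (an assumption): midnight at the start of the
# day on which the first backup begins. Only the clock times are from the post.
NIGHT_OF = datetime(2011, 4, 3)

# (server, start offset from that midnight, estimated run-time in hours)
SCHEDULE = [
    ("Cypress", timedelta(hours=22, minutes=30), 8),
    ("Echo", timedelta(hours=23, minutes=30), 6),
    ("Fresco", timedelta(hours=24, minutes=30), 6),  # 12:30 AM the next day
]

def estimated_windows(night_of=NIGHT_OF, schedule=SCHEDULE):
    """Return {server: (start, estimated_end)} for each scheduled run."""
    windows = {}
    for server, offset, hours in schedule:
        start = night_of + offset
        windows[server] = (start, start + timedelta(hours=hours))
    return windows

if __name__ == "__main__":
    for server, (start, end) in estimated_windows().items():
        print(f"{server}: {start:%a %I:%M %p} to {end:%a %I:%M %p} EST")
```

Under these estimates all three runs would end between 5:30 AM and 6:30 AM EST the following morning. The actual durations may differ.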
If we do not run this process tonight, the only other viable option would be to wait 7 days, until next Sunday. That would leave your data without a reliable backup for a solid week, during which any hardware failure or other major system issue could cause your data to be lost without recourse.
We understand any frustration this may cause and we apologize. We hope that the impact of the backup process will be minimal, since server usage is at its lowest on Sunday evenings, and we expect the process to be finished by the start of normal business hours in the US on Monday morning.
We do realize that not all of our customers are based in the United States and that this backup process may cause issues for clients whose peak times are opposite those in the US. Ultimately, however, we need to perform these backups when overall server usage is at its lowest, and that is during the off-peak time for the US.
If you have any questions, let us know.
█ Scalable shared hosting plans in the cloud! Check them out!
█ Highly Available Cloud Shared, Reseller, and VPS
█ http://www.mddhosting.com/
#2
Posted 02 April 2011 - 12:03 PM
Over the past couple of weeks we've had some performance issues with our servers, mostly during the overnight period, due to a couple of factors. I want to detail why these issues happened and what we're doing to resolve them, and to go over some scheduled maintenance for this evening. We always do our best to be as transparent and honest as possible, and as such we're not going to keep any details from you.
As you may already know, we run CloudLinux on our servers to help keep things stable and to prevent over-usage by some accounts from affecting everybody else. The system is generally very solid and does a wonderful job of doing what it's supposed to do. Recently, on the advice of CloudLinux, we upgraded to a newer version of the system that was supposed to offer increased performance and reliability. Instead, we ran into a couple of serious issues with the new version that were exacerbated by our R1Soft backup system, which can be very intensive all on its own.
During normal use, the new version of CloudLinux we were running on the servers performed well and did indeed give some performance gains, especially in hard drive access speeds. The issues arose when our R1Soft backup system ran to back up your data and protect you from hardware failure and other causes of data loss. This backup process tends to be very intensive, as it needs to scan every bit on the disk to find and record any changes made, in order to keep an up-to-date backup of your data. When this backup process ran, the servers locked up without any warning, forcing us not only to reboot the systems but also to abort a backup process that had already been running for an hour or more. We have downgraded CloudLinux across our servers to a version we operated for a very long time without issues, and we don't anticipate any major issues with the backup systems from this point forward.
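To illustrate why a backup that scans the disk for changes is so read-intensive, here is a minimal Python sketch of generic block-level change detection using per-block hashes. This is not a description of how R1Soft works internally, which this thread does not cover; the block size and the names `BLOCK_SIZE`, `block_digests`, and `changed_blocks` are assumptions for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only

def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash every fixed-size block; note that this reads all of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

def changed_blocks(previous: list[str], current: list[str]) -> list[int]:
    """Return indexes of blocks that are new or differ from the last run."""
    return [
        i for i, digest in enumerate(current)
        if i >= len(previous) or previous[i] != digest
    ]

if __name__ == "__main__":
    old = bytes(2 * BLOCK_SIZE)                   # two zero-filled blocks
    new = old[:BLOCK_SIZE] + b"x" * BLOCK_SIZE    # second block modified
    print(changed_blocks(block_digests(old), block_digests(new)))
    # With no previous digests (a fresh or "seed" backup), every block counts as changed.
    print(changed_blocks([], block_digests(new)))
```

In this simplified model, finding the changes means reading every block even when few have changed, and a fresh backup with nothing to compare against has to copy all of them.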
We have always kept backups of our servers, and we use those backups fairly regularly to fix issues for customers, such as when somebody accidentally deletes a file or drops the wrong database or database table. Because the backup process is very intensive, it does cause a few minutes of extremely slow performance every night when it runs; this usually clears up in 2 to 3 minutes. We realize that this can be annoying, but it's the cost of keeping up-to-date off-server copies of all data and databases.
Because of all the issues we've faced while performing backups over the last week, we feel that we cannot rely on the quality of the backups currently stored for these systems. We therefore need to perform fresh backups of all servers to ensure that, if the need arises to restore data from the backups, the restored data is accurate and problem-free. We evaluated when to run these fresh backups to cause the least impact on service performance for our customers and determined that Sunday nights are the best time for such intensive processes. The next decision was whether to wait a week or two to perform this fresh backup or to do it as soon as possible.
We have chosen to perform the seed backups tonight, as we feel a week without reliable backups is far too long; hardware failure is not something that can be expected or planned for. We do run redundant disk arrays in our servers to help protect against drive failure (up to 2 drives can fail per server without data loss); however, redundant arrays are not a substitute for reliable daily backups of the data. We expect this process to take between 6 and 8 hours per server. Once we've performed this fresh backup, the backup system should not cause any major performance degradation in the future beyond the expected two to three minutes per night that it takes the backup system to spin up and get started on each server.
You may be frustrated with the issues we've faced on a couple of our servers over the last two weeks, and we're right there with you on that frustration. We have posted a thread on the forum with the information contained in this email as well as some additional details about the backup process for this evening. If you have any direct questions, you are welcome to respond to this email or to visit the forums and publicly post your thoughts / questions / comments / suggestions as well.
Here is a link to the forum thread, for your convenience:
http://forums.mddhos...cypress-fresco/
Thank you,
#3
Posted 02 April 2011 - 12:16 PM
For those of us with a VPS on a different server (i.e. Atlantis), I'm assuming the above problems with CloudLinux and the slight possibility that the backups could be corrupted don't apply?
#4
Posted 02 April 2011 - 12:19 PM
#5
Posted 02 April 2011 - 01:12 PM
One minor clarification - both messages and the email mentioned Sunday night, but they also say 'tonight', which of course is Saturday in the US. Just need to let my users know which night.
Thanks,
Tom
#6
Posted 02 April 2011 - 02:38 PM
#7
Posted 02 April 2011 - 02:52 PM
Also, is it today 4/2 or 4/3? It's only Saturday for me hehe
#8
Posted 02 April 2011 - 04:59 PM
#9
Posted 02 April 2011 - 08:48 PM
Thank you MDDhosting for such a wonderful approach to our company and all of your hosted customers as well. It sounds like there are many of us out there that are happy with your services.
#10
Posted 02 April 2011 - 10:58 PM
#11
Posted 02 April 2011 - 11:07 PM
We are watching the servers closely to try and make this as smooth as possible.
#12
Posted 03 April 2011 - 03:27 AM
Echo has finished and Fresco will be done in about 2 hours. Cypress looks to be taking quite a bit longer due to the large amount of data to be backed up (around 1.2 TB).
#13
Posted 03 April 2011 - 03:39 AM
#14
Posted 03 April 2011 - 01:30 PM

#15
Posted 03 April 2011 - 11:02 PM
"I just checked and everything finished a while ago so we're going to mark this as resolved"
If everything is finished, why is the Fresco server still rebooting and slow?
#16
Posted 03 April 2011 - 11:04 PM
"If everything is finished, why is the Fresco server still rebooting and slow?"
The backup process runs every night and does cause some slowness while it gets up to speed. If you're experiencing an issue, open a ticket.
#17
Posted 04 April 2011 - 01:24 PM
P.s.
Ticket sent.
#18
Posted 04 April 2011 - 01:26 PM
#19
Posted 04 April 2011 - 01:26 PM
"Just a note, because it might be related seeing as this is the most recent action taken on the server I'm on. We've been extraordinarily slow for about an hour now. Cypress server. 8+ seconds to load a page on my sites.
P.s.
Ticket sent."
I can confirm that my site on Cypress is experiencing the same issue. CloudFlare cache is kicking in because my site is timing out.
#20
Posted 04 April 2011 - 01:29 PM
"I'm surprised you're able to access your site. For me, my site doesn't even load."
The current server load doesn't appear bad. Must be an I/O issue?
Server Load 7.13 (16 cpus)
Memory Used 45.9 %
Swap Used 0.61 %
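As a rough way to read those numbers, here is a small Python sketch that normalizes the load average by the CPU count. The helper name `load_per_cpu` is an assumption for illustration. On Linux the load average counts both runnable tasks and tasks in uninterruptible I/O wait, so a low per-CPU figure suggests the CPUs are not saturated, but it cannot by itself confirm or rule out a disk I/O bottleneck.

```python
def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Normalize a load average by the number of CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count

if __name__ == "__main__":
    # Figures quoted in the post above: load 7.13 on 16 CPUs.
    print(f"{load_per_cpu(7.13, 16):.2f} load per CPU")
```

A figure below 1.0 per CPU would be consistent with the guess that the slowness is I/O-related rather than CPU-bound, though disk statistics would be needed to say so with confidence.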