Michael D. (Author) · Posted September 2, 2017
CageFS, a technology we use to isolate user accounts from each other, is causing the offline transfer to take longer than anticipated.
Michael D. (Author) · Posted September 2, 2017
The offline sync is taking longer than expected due to virtual file systems and some files and folders we neglected to copy during the online sync. We'll continue to keep this thread updated.
ericr · Posted September 2, 2017
To be clear, the home directory was not cleanly copied over in the sync we did before we took the server offline. Right now I am correcting the problems and will update as soon as I have more news.
ericr · Posted September 2, 2017
The rsync pass is no longer finding missing files. This is good news. I will update further as soon as possible.
ericr · Posted September 2, 2017
I have found more directories with issues. I will update when these copies are complete.
ericr · Posted September 2, 2017
We are about two-thirds of the way through the home directory.
ericr · Posted September 2, 2017
I have aborted the work. We will reschedule it for next weekend after we reassess the procedure.
Michael D. (Author) · Posted September 2, 2017
We will be re-evaluating migrating the server as a whole. The plan was for it to take no more than a couple of hours, with no IP addresses or server names changing. Unfortunately this method may not work, and we may be forced to perform standard cPanel migrations. We were able to use this method to migrate all of our internal servers and systems, but our old storage platform simply may not be able to keep up with the amount of data transfer required to migrate one of these servers as a whole. If that is the ultimate outcome, we will work with you to make the transfer process as smooth as possible.
Michael D. (Author) · Posted September 7, 2017
I hate to do it, but we're scheduling another maintenance window for the R3 server this Friday, September 8th, 2017, from 10 PM ET to 2 AM ET [GMT-4]. During the last failed attempt, copying the data did not actually take very long; we spent a fair bit of the time trying to get the new server to boot successfully. We have performed extensive testing since the transfer failure of the R3 server last Saturday, and we are confident that we have resolved all of the outstanding issues that prevented the transfer from succeeding. We plan for the actual outage to be less than an hour; however, we will keep this thread updated as we did with the last attempt. I did intend to send out an email on Tuesday, as Monday was a holiday, detailing the failure and what we're doing to correct it, but admittedly I've been so focused on getting the issues resolved that it slipped my mind. I am sending an email now to all affected clients about this. If you have any questions, you can ask them here or reply to the email you're about to receive if you're on the R3 server, and we'll be happy to help.
Jeroen · Posted September 8, 2017
Not my server, but still GOOD LUCK!
Michael D. (Author) · Posted September 9, 2017
We will be starting maintenance on the R3 server in about 20 minutes.
ericr · Posted September 9, 2017
I have brought R3 offline and I am starting the offline synchronization process.
ericr · Posted September 9, 2017
We are making good strides. We are about halfway through the home directory after a 30-minute delay caused by the VMware cluster unexpectedly re-configuring the network interfaces.
Michael D. (Author) · Posted September 9, 2017
The data sync is still in progress.
ericr · Posted September 9, 2017
The home directory is synchronized. The process has completed CageFS as well and is now working on MySQL.
ericr · Posted September 9, 2017
We have only one rsync thread still running, on /usr.
ericr · Posted September 9, 2017
R3 is online and happy on the new cluster. Thank you very much for your patience during this work.
Jeroen · Posted September 9, 2017
Wow, that went well. Good job guys!
Michael D. (Author) · Posted September 9, 2017
We do not expect any issues as a result of this migration, as the operating system, control panel, files, and daemons are all exactly the same as they were. In the event that you do run into anything odd, please reach out to support.
cziv · Posted September 9, 2017
So, what is next? Only 6 days left until the 15th of September for the rest of us, as your email initially mentioned.
Michael D. (Author) · Posted September 9, 2017
Due to the overwhelming success of this latest migration, we'll be reaching out to clients on Monday to schedule the rest of the transfers. It is possible that we may have to push some transfers past September 15 in order to avoid moving too much data too fast and choking our former storage platform, but we're going to get everybody moved as soon as possible. We definitely do not want to go past the end of September; otherwise we'll be paying for both the old storage platform and the new one for an additional month.

I'm providing some graphs that show disk I/O latency both before and after. Keep in mind that these are logarithmic, so these are pretty massive improvements, in addition to the stability of the graphs after the upgrade.

SDA is the "/" volume on the server: basically everything besides MySQL and the "/home" folder, which stores user data.
SDB is the "/home" volume: your home directory, account data, PHP files, emails, etc.
SDC is the MySQL volume.

The first graph is for the root folder of the server, "/". The green spikes do not affect the performance of client sites in any way; they come from very I/O-intensive scheduled tasks that run on the server, such as accounting and bandwidth tracking. The next graph for SDA, the root "/" volume, shows that the average latency on reads and writes has gone down, as has the I/O wait time; again, this chart is logarithmic, so it's a pretty large improvement.

The graph for SDB, the "/home" volume, shows the blue line for I/O wait is much lower, and everything is substantially lower except the write I/O wait time. This is to be expected, as we are replicating 3 copies of the data on every write for redundancy against failure. Reads are substantially faster and more consistent.
The graph for SDC is for MySQL, and it is plainly obvious that all of the metrics are lower [lower is better] as well as more consistent. MySQL actually has one of the largest impacts on the performance of dynamic sites like WordPress, Joomla, Drupal, Magento, etc. This is a massive improvement in both consistency and speed.
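If you want to see which block device backs a given mount point on your own machine, df can show the mapping. The sda/sdb/sdc labels in the graphs are specific to how this server's volumes are attached, so the device name in the output will differ elsewhere; this sketch assumes a Linux host with GNU coreutils:

```shell
# Print the block device backing the root filesystem. On the R3 server
# this corresponds to the "SDA" label in the graphs above; the actual
# device name varies from host to host.
root_dev=$(df --output=source / | tail -n 1)
echo "/ is backed by $root_dev"
```

Running the same command against /home (or MySQL's data directory) would show the SDB and SDC equivalents on a server partitioned this way.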
Jeroen · Posted September 9, 2017
Just to be sure, I can post and make changes to my sites like I normally do until the offline sync of data, right?
Scott · Posted September 9, 2017
As a general rule: yes. To be safe, I'd suggest saving your work and disconnecting a few minutes before the shutdown. Imagine you are writing a blog post and need to press save. You're writing and everything is going swell, but we do the shutdown a few minutes before you finish. You hit the save button and... you get a connection error. Depending on when you last saved, some work could be lost.
Michael D. (Author) · Posted September 11, 2017
Now that some time has passed on the new platform, here's an updated graph showing this weekend as well as most of today on the R3 server, compared to the old storage. Lower is obviously better; the top 3 lines in the second graph are writes to disk, and the bottom 3 lines are reads from storage. Writes will be slower than reads due to the data duplication for redundancy, but even with this, the overall performance is very much improved.
Rhody401 · Posted September 15, 2017
Very impressive. Is R1 still scheduled for tonight? (9/15/17)