MDDHosting Forums

SR3 - MySQL Unavailability



A series of events led to the MySQL server on SR3 being unavailable for approximately 6 hours tonight.

 

Due to a provisioning error, the server's partitioning scheme was incorrect, which left the MySQL data residing in a partition we do not actively monitor for consumption. MySQL filled this partition entirely, and data corruption occurred in a vast number of databases residing on the server.
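
For readers who want to guard against the same failure on their own systems, here is a minimal sketch of a disk usage check that covers the partition holding the MySQL data. It is not the monitoring we run; the function names, the 90% threshold, and the `/var/lib/mysql` path in the usage note are assumptions for illustration only.

```python
import shutil


def partition_usage_percent(path):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a total size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total


def check_partitions(paths, threshold_percent=90.0):
    """Return (path, percent_used) for every path at or above the threshold.

    `threshold_percent` is a hypothetical default; choose a value that
    leaves enough time to react before the partition is full.
    """
    alerts = []
    for path in paths:
        percent = partition_usage_percent(path)
        if percent >= threshold_percent:
            alerts.append((path, round(percent, 1)))
    return alerts
```

A call such as `check_partitions(["/", "/var/lib/mysql"])` (the second path is a common MySQL data directory, but confirm the `datadir` setting on your own server) returns an empty list while every listed partition is below the threshold. The point is to check the path where the data actually lives rather than only the partitions you expect to fill up.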

 

This wasn't a typical outage where a piece of hardware could simply be replaced or a piece of software restarted. In this case we spent the last 6 hours working as hard as we could to recover the corrupted data and restore the service with as little impact as possible.

 

We were able to bring quite a few customers online about 3 hours in. However, I do not have a specific list, as we were working on the server as a whole and not on individual accounts.

 

I know that there were two databases that were completely beyond repair and we are reaching out to the affected customer to restore backups and get them back online. All other databases should be intact.

 

We then ran into an obscure issue with CageFS - a CloudLinux technology that provides each user a 'safe bubble' for their data, protecting it from everybody else on the server - that prevented some accounts from accessing MySQL even though it was online and operational. The fix for this is currently in progress and, by my estimate, approximately half done. It is moving at a rate of about 1 account every 4 seconds, so everybody should be 100% online very soon.
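
To turn that rate into a rough time estimate, the arithmetic can be sketched as follows. The 4 seconds per account comes from the rate above; the number of accounts still waiting is not stated in this post, so the count in the usage note is purely hypothetical.

```python
def remaining_minutes(accounts_remaining, seconds_per_account=4.0):
    """Estimate the minutes left at a fixed per-account processing rate."""
    return accounts_remaining * seconds_per_account / 60.0
```

For example, with a hypothetical 450 accounts still to process, `remaining_minutes(450)` gives 30.0, or about half an hour.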

 

I do apologize for the trouble this may have caused you, as well as for the lack of a forum post/updates during the issue. To be honest, I was focused on resolving this issue as quickly as possible, and we did respond to all support tickets opened in a timely fashion. Even though you may not have noticed this MySQL outage and your monitoring may not have picked it up, we still wanted to post what happened and why.

 

We are going to evaluate our provisioning process and determine ways to ensure that the mistakes that led up to this issue cannot happen in the future.

 

If you have any questions about this outage in general, please feel free to ask them here. If you have questions specific to your account or you are currently having issues accessing MySQL on your account, please open a support ticket.


Just a small note of sincere thanks for making this middle-of-the-night fix to what may have been a relatively new server. As they say, stuff happens. Despite the hour, you rallied to make the extensive fix and went above and beyond to ensure that every account was functioning perfectly.

 

Thank you seems insufficient given all your work. Such dedication is what makes MDDHosting truly exceptional.
