Major Outage - 09/21/18+ - Client Discussion


419 replies to this topic

#401 PhilD13

PhilD13

    Newbie

  • Members
  • Pip
  • 7 posts

Posted 27 September 2018 - 05:29 AM

Command-line commands generally don't ask "are you sure?". If you have the authority to run the command, it executes immediately.


  • 0

#402 SarisIsop

SarisIsop

    Advancing Member

  • Members
  • PipPipPip
  • 152 posts
  • Gender:Not Telling

Posted 27 September 2018 - 06:23 AM

Speaking of full cPanel backup, which option is this in cPanel?

 

I have just posted a newbies guide if it helps:

 

https://forums.mddho...up-for-newbies/


  • 0

#403 PhilD13

PhilD13

    Newbie

  • Members
  • Pip
  • 7 posts

Posted 27 September 2018 - 06:28 AM

A few things I did and a few things I learned. I have a reseller account with a few clients.

I found out about the outage about 10 minutes after it happened, but it was about 30 minutes before I could verify my sites were truly offline. I think by that time, or soon after, I received an email from Mike about the issue.

 

The first suggestion I made to one of my clients was to redirect whatever email was being forwarded to their domain over to their Gmail account instead. It was inconvenient, and as the outage went on they did have some mail bounce, but they were able to continue to accept and service orders because their main site is not hosted by MDD. They were very grateful for the suggestion even though their secondary domain hosted on MDD was down.

 

The second thing I did was inform my other clients that there would be an extended outage of their websites, as there was an issue in the data center hosting them, and that I would keep them updated as to when I thought the outage would be over.

 

The third thing was to verify how recent my offline backups were. I auto-generate, auto-download, and keep copies of the site files and databases for each client. They were a few days old, and the sites don't change much, so I was good there. But I was not doing cPanel full backups, so no email was backed up. My bad there; I never thought about it.

 

Next I waited and monitored until late Friday evening to see if things would recover. When nothing had recovered after several hours, I sent a trouble ticket asking if any servers were still operational and if the outage would go past Noon on Saturday. VPS servers were operational and I could get an account on one to restore the sites if I wanted. Mike thought it would be fine by early morning. This was before they knew the data arrays were corrupted and could not be recovered.

 

Saturday noon came and went, and talk was not good about the situation. I informed my clients and asked if they were good to wait until the servers were restored from backups, which could be Tuesday or Wednesday, or if they wanted me to move their sites temporarily. They elected to just wait, and those domains were not likely to miss any emails.

 

A few things I learned that might help others in the future.

I thought I was prepared, as I have in the past had hosts disappear in the night, owner die, etc., but learned I was not fully prepared to recover all data. I need to make sure I have email backed up and a full cpanel copy in addition to site and database copies that I already do.

I was much better off than those who had stored clients' backups online, or who did not have current ones or any at all.

No matter how good the host service, how big (or small), how much trust there is, how good their customer service is, always be prepared to have all of your data and email, or a customer's data and email, lost in an instant. Heed the "you are responsible for all data, and your own backups, we are not responsible for lost data" that every host posts.

Don't wait forever to inform your customers and be honest with them about the issue.

This I already know from my primary job: whatever the problem, it will take at least four times as long to fix as first thought.

Have a plan to bring your clients' websites back online quickly if the outage will be extended, even if it costs you money in the short term.

Send a follow-up letter summarizing the outage and what you will or can do better to prevent extended downtime in the future.
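
For anyone who wants to automate the "verify how new the offline backups are" step, here is a minimal sketch. It assumes a layout that is purely hypothetical (one subfolder per client under a local backup folder) and treats the newest file's modification time as the backup date; the function names and the 7-day threshold are illustrative, not anything MDD or cPanel provides.

```python
import time
from pathlib import Path


def newest_backup_age_days(client_dir, now=None):
    """Age in days of the most recently modified file under client_dir.

    Returns None when the folder contains no files at all.
    """
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(client_dir).rglob("*") if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 86400.0


def stale_clients(backup_root, max_age_days=7.0, now=None):
    """Map client folder name -> backup age for clients needing attention.

    A client is reported when it has no backup files (age None) or when its
    newest backup is older than max_age_days. The one-folder-per-client
    layout is an assumption made for this sketch.
    """
    report = {}
    for client_dir in sorted(Path(backup_root).iterdir()):
        if not client_dir.is_dir():
            continue
        age = newest_backup_age_days(client_dir, now)
        if age is None or age > max_age_days:
            report[client_dir.name] = age
    return report
```

Running `stale_clients("/path/to/offline-backups")` from a scheduled job and emailing yourself any non-empty result would catch a backup job that has silently stopped. It only checks dates, so it will not tell you whether email is actually included in the backup.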


  • 0

#404 MisterNeutron

MisterNeutron

    Newbie

  • Members
  • Pip
  • 4 posts

Posted 27 September 2018 - 06:57 AM

I don't think I've seen anyone mention this, but while my sites on s2 all came back properly, I did lose all of the logs that are used for things like awstats and Webalizer. In fact, Webalizer was turned off (the normal default). Personally, I don't care about this, but if anyone is using the metrics available through cPanel, you might want to take a look at what's there (in the tmp directory).


  • 0

#405 mdd_shared_user

mdd_shared_user

    Newbie

  • Members
  • Pip
  • 8 posts

Posted 27 September 2018 - 06:59 AM

Command-line commands generally don't ask "are you sure?". If you have the authority to run the command, it executes immediately.

I think that's probably true, but as has been quite painfully demonstrated at MDD, it makes no sense when the command can be so destructive.  I don't expect that administrators can control it, but confirmations would help:

 

   Are you sure you want to DELETE all data on ALL SERVERS?  Type YES to continue, NO to cancel.

 

Or, if the confirmation text could be configured:

 

   Are you sure you want to put MDD out of business?  Type YES to continue, NO to cancel.

 

Also, is a block discard command EVER used on these systems?  I simply can't get my head around how this command, significantly different than the cleanup command, could be entered.

 

Regardless, it happened and Mike and crew did the right thing, an outstanding effort to restore as quickly as possible.  And they are addressing the issues that caused it and will make recovery faster and less traumatic in future.
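
To make the confirmation idea concrete, here is a minimal sketch of a wrapper that refuses to run an action unless the operator types YES exactly. This is only an illustration; the function names are made up for this example, and it says nothing about how the real tools involved in the outage behave.

```python
from typing import Callable


def confirm_destructive(action: str, read_line: Callable[[str], str] = input) -> bool:
    """Ask the operator to confirm; only an exact, upper-case YES counts."""
    prompt = f"Are you sure you want to {action}?  Type YES to continue, NO to cancel: "
    answer = read_line(prompt)
    return answer.strip() == "YES"


def run_if_confirmed(action: str, command: Callable[[], None],
                     read_line: Callable[[str], str] = input) -> bool:
    """Run command() only after confirmation; return whether it ran."""
    if confirm_destructive(action, read_line):
        command()
        return True
    print("Cancelled.")
    return False
```

The `read_line` parameter exists only so the prompt can be replaced in tests; in normal use it defaults to the built-in `input`. Anything other than YES, including lower-case "yes", cancels.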


  • 0

#406 MikeDVB

MikeDVB

    Forum Administrator

  • Staff Administrator
  • PipPipPipPipPip
  • 2,883 posts
  • Gender:Male
  • Location:Central Indiana, USA

Posted 27 September 2018 - 10:16 AM

We've actually blacklisted the command.  It won't ask, 'Are you sure,' it will say, 'You are not permitted to run this command.'

 

There are reasons you would use a block discard - it's a valid command when used in the right situation.  This wasn't one of them.
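
For readers wondering what "blacklisting a command" can look like, here is a rough sketch of the general idea. It is not MDD's actual system; the denylist entries, the exception name, and the function are all hypothetical, chosen only to illustrate checking a command line before it is allowed to run.

```python
import os
import shlex

# Illustrative entries only; a real list would be maintained by the admins.
DENYLIST = frozenset({"blkdiscard", "mkfs", "wipefs"})


class CommandNotPermitted(Exception):
    """Raised when the requested program is on the denylist."""


def check_command(command_line: str, denylist=DENYLIST) -> list[str]:
    """Parse command_line and refuse it if the program name is denied.

    Returns the parsed argument list when the command is allowed.
    """
    argv = shlex.split(command_line)
    if not argv:
        raise ValueError("empty command")
    # Compare on the program's base name so /sbin/foo and foo match alike.
    program = os.path.basename(argv[0])
    if program in denylist:
        raise CommandNotPermitted("You are not permitted to run this command.")
    return argv
```

A check this simple is easy to bypass (for example by invoking the program through another wrapper), so treat it as a sketch of the behavior described above rather than a complete safeguard.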


  • 0
Michael Denney - MDDHosting LLC - Providing Hosting since 2007
Scalable shared hosting plans in the cloud! Check them out!
Highly Available Cloud Shared, Reseller, and VPS
http://www.mddhosting.com/

#407 Arpeggio

Arpeggio

    Member

  • Members
  • PipPip
  • 27 posts

Posted 27 September 2018 - 10:27 AM


   Are you sure you want to DELETE all data on ALL SERVERS?  Type YES to continue, NO to cancel.

 

 

An authorization code only known by Mike and written on some paper in a safe somewhere. Given what is at stake I don't think that would be over the top.


  • 0

#408 grentz

grentz

    Newbie

  • Members
  • Pip
  • 1 posts

Posted 27 September 2018 - 10:51 AM

Thanks for the work and updates. 

These types of events are never fun for the client or the provider. 

 

While it was far from ideal, things do happen. I have had things happen even on AWS with multiple layers of redundancy. 

 

The biggest message I would leave for the folks that are really pushing on business disruption is if your services are that critical, have layers of redundancy that you manage as well. 

For example, my eggs are not all in one basket for critical services. NameServers, DNS Zones, Web Hosting, and Email are all on separate providers that can be independently managed. 

 

If your email is that critical, you really should not be hosting it on your web host. Get a Google or Office365 account or something more specialized. 

If your site is that critical, have a backup and mirrors that you can bring online.

 

No technology is perfect, even ones we pay exponentially more for. 


  • 0

#409 teacdan

teacdan

    Newbie

  • Members
  • Pip
  • 2 posts

Posted 27 September 2018 - 12:23 PM

We’re evaluating what options there are so that hopefully we can offer such functionality for you. I know it’s doable with a custom script of some kind but it would be nice for it to be built in.

 Thanks. Please keep us posted on this progress. It would be nice to implement it as soon as possible.


  • 0

#410 jeffs

jeffs

    Newbie

  • Members
  • Pip
  • 1 posts

Posted 27 September 2018 - 10:33 PM

Mike, just want to thank you again for such great service over the years and for your hard work and transparency during this whole ordeal. I know this has been very stressful and problematic for you and for so many people. 

 
Your professionalism has been exemplary – working hard within the limits of human sleep deprivation and equipment bottlenecks, keeping us updated so frequently, outlining your plans for improving things in the future and responding calmly no matter what the tone of the message.
 
I've also been impressed by the high levels of professionalism, tolerance, compassion and helpfulness of the user posts. Feels like a solid community that I'm happy to be part of.
 
I'm following your lead and thinking hard about what I can do differently to minimize disruption should a major problem arise in the future. As a non-IT professional with a couple of low-traffic non-commerce websites and lots of email accounts in my domains, I've been good about keeping website backups with a WordPress plugin but honestly didn't realize that cPanel backups would include all the email accounts so I thought they were redundant and didn't do them. And I was one of the people who didn't understand that the servers were up and I could have restored my websites with my backups until you spelled it out on Monday in this thread (I'm not on Twitter).
 
So going forward I'm going to keep up-to-date cPanel backups and I'll remember that “the data needs to be restored” does not mean “the server is down” (I'm sure that's glaringly obvious to an IT professional, but we enthusiastic amateurs, well, our minds work in some strange ways).
 
Maybe this would be a good time to update/expand a couple articles in the Knowledgebase, to help us do our part? I see there is a category called “Backups and Restorations” that has no articles in it, and two brief articles on backing up and restoring in the “Account Questions” category.
 
Thanks again for your commitment to maintaining such high standards. We all make mistakes, what matters is how we respond to them, and again, I think you have been exemplary. I'm sure that this episode is costing MDDHosting a bundle. Please let us know if it compromises the viability of your company – I'm sure there are a lot of us who would happily jump on that GoFundMe train rather than see MDDHosting go out of business.

  • 0

#411 KevinJones

KevinJones

    Newbie

  • Members
  • Pip
  • 12 posts

Posted 28 September 2018 - 07:50 AM

From the 2 cents department, and maybe this has been talked about already (this is a loooong thread), I wonder if you have considered implementing software that requires 2 separate users entering their password for commands that are potentially catastrophic.  (and no cheating :-))
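
A minimal sketch of what such a two-person rule could look like, assuming a hypothetical in-memory store of salted password hashes (the store layout and function names are invented for illustration; a real deployment would sit on top of the existing authentication system):

```python
import hashlib
import hmac


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def two_person_approval(credentials, approvals) -> bool:
    """Return True only if at least two DIFFERENT known users authenticate.

    credentials: dict mapping user name -> (salt, expected_hash)
    approvals:   iterable of (user name, password) pairs entered by operators
    """
    approved = set()
    for user, password in approvals:
        record = credentials.get(user)
        if record is None:
            continue  # unknown user, ignore
        salt, expected = record
        if hmac.compare_digest(hash_password(password, salt), expected):
            approved.add(user)  # a set, so the same user cannot count twice
    return len(approved) >= 2
```

Collecting approvers in a set is the "no cheating" part: one person entering their own password twice still counts as a single approval.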


  • 0

#412 MikeDVB

MikeDVB

    Forum Administrator

  • Staff Administrator
  • PipPipPipPipPip
  • 2,883 posts
  • Gender:Male
  • Location:Central Indiana, USA

Posted 28 September 2018 - 08:25 AM

From the 2 cents department, and maybe this has been talked about already (this is a loooong thread), I wonder if you have considered implementing software that requires 2 separate users entering their password for commands that are potentially catastrophic.  (and no cheating :-))

It has been mentioned. We already have a system that outright blocks destructive commands but this command wasn’t on the list. It is now, that is for sure.
  • 0

#413 cziv

cziv

    Member

  • Members
  • PipPip
  • 52 posts
  • Gender:Male

Posted 29 September 2018 - 01:18 PM

Alas, all AWStats are GONE from the cPanel statistics; I mean they are all zeroed.


  • 0

#414 MikeDVB

MikeDVB

    Forum Administrator

  • Staff Administrator
  • PipPipPipPipPip
  • 2,883 posts
  • Gender:Male
  • Location:Central Indiana, USA

Posted 29 September 2018 - 01:21 PM

Alas, all AWStats are GONE from the cPanel statistics; I mean they are all zeroed.

cPanel, in their infinite wisdom, stores stats data in the temporary folder. It’s something I will be addressing with both JetBackup as well as cPanel on Monday or Tuesday.
  • 0

#415 cziv

cziv

    Member

  • Members
  • PipPip
  • 52 posts
  • Gender:Male

Posted 29 September 2018 - 01:45 PM

cPanel, in their infinite wisdom, stores stats data in the temporary folder. It’s something I will be addressing with both JetBackup as well as cPanel on Monday or Tuesday.

 

I have a full cPanel backup of my own. Could I do something? Upload what AWStats needs to the temp folder?


  • 0

#416 MikeDVB

MikeDVB

    Forum Administrator

  • Staff Administrator
  • PipPipPipPipPip
  • 2,883 posts
  • Gender:Male
  • Location:Central Indiana, USA

Posted 29 September 2018 - 01:46 PM

 

I have a full cPanel backup of my own. Could I do something? Upload what AWStats needs to the temp folder?

Yes.  You could extract the temporary folder out of the backup and upload it to the account.


  • 0

#417 cziv

cziv

    Member

  • Members
  • PipPip
  • 52 posts
  • Gender:Male

Posted 29 September 2018 - 01:50 PM

Yes.  You could extract the temporary folder out of the backup and upload it to the account.

 

Mike,

 

I see homedir/tmp/awstats

 

Is that the ONLY directory I should upload, or the WHOLE tmp?


  • 0

#418 MikeDVB

MikeDVB

    Forum Administrator

  • Staff Administrator
  • PipPipPipPipPip
  • 2,883 posts
  • Gender:Male
  • Location:Central Indiana, USA

Posted 29 September 2018 - 01:56 PM

Either way.  It's a temporary folder, meaning it should be able to be wiped without causing harm, which is the main reason I've never liked that cPanel stored anything persistent there.
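
For anyone who would rather script the extraction than do it by hand, here is a minimal sketch using Python's standard `tarfile` module. It assumes the full backup is a tar archive (optionally gzipped) that contains a `homedir/tmp/awstats` folder somewhere inside it, as mentioned above; the function name and the destination folder are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path, PurePosixPath


def extract_awstats(backup_path, dest_dir, subdir="homedir/tmp/awstats"):
    """Copy only the files under `subdir` from the backup into dest_dir.

    Returns the list of files written. Paths are rebuilt relative to
    `subdir`, so any top-level folder inside the archive is dropped.
    """
    dest = Path(dest_dir)
    wanted = PurePosixPath(subdir).parts
    written = []
    with tarfile.open(backup_path, "r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue
            parts = PurePosixPath(member.name).parts
            # Look for the wanted folder sequence anywhere in the member path,
            # requiring at least one path component after it (the file itself).
            for i in range(len(parts) - len(wanted)):
                if parts[i:i + len(wanted)] != wanted:
                    continue
                relative = parts[i + len(wanted):]
                if ".." in relative:
                    break  # refuse paths that try to escape dest_dir
                target = dest.joinpath(*relative)
                target.parent.mkdir(parents=True, exist_ok=True)
                source = archive.extractfile(member)
                with source, open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                written.append(target)
                break
    return written
```

After running something like `extract_awstats("backup.tar.gz", "awstats-restore")`, the contents of the output folder can be uploaded to `tmp/awstats` in the account via FTP or the File Manager.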


  • 0

#419 AMC4x4

AMC4x4

    Newbie

  • Members
  • Pip
  • 24 posts

Posted 01 October 2018 - 03:24 PM

Just checking in to see how things are going with you and the crew, Mike. Hope you all managed to get some sleep this weekend and that the engineer who made the mistake is feeling a little better. Thanks again for all the updates. 


  • 0

#420 LShoe

LShoe

    Newbie

  • Members
  • Pip
  • 13 posts

Posted 09 October 2018 - 12:33 AM

One more suggestion: I didn't even know about these forums until I searched on Google for info about the outage. Now that I'm looking for it I do see at the bottom of your home page a link to Community Forums, but within the Client Area I don't see any link or information. I would suggest adding a link in the Client-Area Support dropdown menu. I bet this would also cut down on the number of tickets submitted.


  • 0



