MDDHosting Forums

Echo Server - File System Errors, Reboot and FSCK Required



The Echo server has become intermittently unavailable, and upon logging into the console it's apparent that the file system has become corrupted.

We are issuing a reboot and will need to initiate a file system check. The file system check could take up to 2 hours; however, we will be providing updates via this thread as it progresses.


The server is back online and is working on catching up due to the flood of requests. I did disable SSD caching so it will take a little longer than normal but it should be back to normal in ~15 minutes.

 

I think the databases are offline, since I get "error establishing connection".


When is echo forecast to be back up? Is there no back up plan when servers go down at all?

It is back online now but it will take time for it to stabilize. Because the SSD caching was the cause of the corruption, we had to disable it.

 

That said, can you elaborate as to what you expect when you ask about a 'back up plan'? We do have backups of all servers but restoring the backups isn't something that would be instantaneous.


At my business we have redundancy built in to get the content back up on another server when something like this happens. I wanted to know whether, any time something goes down, I am going to have to wait until you fix whatever failed, be it drives, power supply or whatever the problem might be. I am new here and was wondering. If I let one of my servers at work stay down for an hour I would be looking for a new job.

There's no way we could provide such redundancy for a whole server at the price point of $7.50/month - such redundancy costs on the order of a couple hundred dollars per month [for a single account/user/site/need]. We do have spare drives, power supplies, motherboards, etc., but when the file system becomes corrupted it's not as simple as just swapping a drive - a file system check is required and there's no way around that. It has to scan the whole storage system, look for errors, and repair them. The file system being scanned and repaired has to be offline during this time, so while the server is technically 'on', it's not serving requests.

To have the sort of redundancy you seem to be expecting would require us to literally double our infrastructure, with half of it sitting idle, and we'd also need automation in place to ensure the backup hardware [online, doing nothing] stayed 100% in sync with the active hardware.

At the price point of regular shared hosting we simply can't provide high availability hosting in the sense that you seem to be expecting - there are services out there that do offer this but they're on the order of $100/mo+ for a single user/site/account.

 

I think your expectations and reality are simply a little out of sync in this case but let me know if I can clarify further.


The facility is pulling both solid state drives used for caching and is swapping in a known-good drive for us so we can re-enable the caching. This will not incur any downtime.

 

Once the new drive is installed we will re-enable caching and server performance should return to normal or very near it.

 

We will be testing both removed drives to find the one that failed, then replacing it with a new drive and restoring the original caching configuration.

This should all be transparent, except that once the caching is re-enabled the server will respond more quickly.


As I said, I am new to this whole thing and decided to try this on shared hosting with the better shared plan. So do any of your plans have redundancy built in, with the expectation of getting a site back up within an hour or so? And I am paying 46.50 per quarter, not 7.50 a month. I guess I should have asked more questions when I signed up, but you had glowing reviews. I didn't expect to have it go down so soon after joining.


As I said, I am new to this whole thing and decided to try this on shared hosting with the better shared plan.

Sure - I'm not trying to be condescending but just trying to be transparent/straightforward so that you know exactly what to expect.

 

So do any of your plans have redundancy built in, with the expectation of getting a site back up within an hour or so?

The server is back online, and has been for a little while now.

 

It was brought back online as quickly as we could - we obviously don't want things offline any more than any one of our customers does [in short - not at all].

 

And I am paying 46.50 per quarter, not 7.50 a month.

Even that is nowhere near the cost of high availability hosting for critical sites. If you really do need high availability with load-balancing [i.e. dual hardware to handle requests if one piece of hardware fails] I'd suggest looking into it - most cannot or choose not to afford it.
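
To make "dual hardware to handle requests if one piece of hardware fails" a bit more concrete, here is a minimal sketch of the selection step a load balancer performs. It is an illustration only, not a description of any particular provider's setup: the `Backend` class, the `is_healthy` probe, and the server names are all hypothetical, and a real deployment would also have to keep files and databases in sync between the two machines.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Backend:
    """A server that can answer requests (hypothetical, for illustration)."""
    name: str
    is_healthy: Callable[[], bool]  # hypothetical health probe


def pick_backend(backends: Sequence[Backend]) -> Optional[Backend]:
    """Return the first backend whose health probe passes, or None."""
    for backend in backends:
        try:
            if backend.is_healthy():
                return backend
        except Exception:
            # A probe that raises is treated the same as a failed probe.
            continue
    return None


# Simulate the situation in this thread: the primary is unusable,
# and a second, normally idle machine is available to take over.
primary = Backend("primary", lambda: False)
standby = Backend("standby", lambda: True)

chosen = pick_backend([primary, standby])
print(chosen.name if chosen else "no healthy backend")
```

The selection logic itself is trivial; the cost comes from the second machine that sits idle most of the time and from keeping its data current.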

 

I guess I should have asked more questions when I signed up but you had glowing reviews.

Hardware failure can happen to any provider - even High Availability services can and do have downtime. Google has been down [and they have numerous data center facilities and tons of redundancy], Amazon AWS has been down for days at a time, CloudFlare has had downtime - it happens. We can't ever promise there won't be issues but what we do promise is that we'll get them resolved as quickly as possible and will keep you informed/updated along the way as to the progress, what happened, why it happened, and what we're doing about it [i.e. what we're doing in this thread].

 

I didn't expect to have it go down so soon after joining.

Nobody "expects" downtime but it's a fact of life - have an online presence long enough and you're bound to experience some. By comparison we have customers that have been on the server you're on for well over a year without any downtime - the difference between them and you is simply that they've been with us longer.

 

If you have any other questions or concerns at all just let us know - we're happy to answer any and all of your questions.
