Hi, with respect to all the good work done at MDD (and I really do mean that), these outages do let down MDD Hosting. I have a huge historical list of alerts I could pull up from various monitoring tools, covering outages over the year or so I've been with MDD, most of them minor, some not so minor. I have been subjected to a number of outages, for which there is always going to be a reason, and perhaps this is just the price we pay for the value we get in other areas from MDD.
We actually invested heavily in our infrastructure in Q1 of last year - replacing everything. All brand new current-generation servers, networking equipment, etc.
We haven't had any outages that I can remember related to hardware failure. We have had hardware fail, but our High Availability setup made the failure seamless and nobody noticed.
That said - the network provided to us hasn't been as stable lately as it has been historically. This is something I'm going to have a call with our facility about sometime soon. At some point I'd really like for us to be running our own routers and managing our own ISP connectivity, but the cost is prohibitive at our scale. We're already paying a premium for 2x10 Gbps connections to the internet from our upstream.
In this case it seems as though there was no change plan, or at least it was not a well tested one.
The border routers needed to be updated to the latest JTAC Recommended version of JunOS. One was brought down for the update and, while it was down, the other crashed. Under normal circumstances, if one were to crash the other would take over - and this has been tested in practice. Why it failed last night is uncertain, and Juniper support has been engaged to investigate the issue. Once a cause is known, the maintenance will be re-scheduled and the problem experienced last night avoided. Unfortunately things don't always go to plan - being the real world and all. That said - we do our best, as I am sure our upstream provider does.
This area of uptime / availability is also the one area that is mentioned over and over again in hosting comparisons as a negative of MDD Hosting. I believe MDD Hosting has one of the highest rates of outages of any hosting provider, which really undermines all the other good work that goes on. Even so, MDD is well rated, fairly priced and known for good speed.
We have relatively few outages overall. Most of our servers are in the 99.95% to 99.99% range over the last year.
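To put those percentages in perspective, here is a quick sketch of the arithmetic, assuming a 365-day year. The function name `downtime_hours` is just illustrative and not something from our monitoring. It works out to roughly 4.4 hours of downtime per year at 99.95% and about 53 minutes per year at 99.99%.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year, 8760 hours


def downtime_hours(uptime_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime implied by an uptime percentage over a period."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    # The unavailable fraction of the period is (1 - uptime), scaled to hours.
    return (1.0 - uptime_percent / 100.0) * period_hours


if __name__ == "__main__":
    for pct in (99.95, 99.99):
        hours = downtime_hours(pct)
        print(f"{pct}% uptime -> {hours:.2f} hours ({hours * 60:.0f} minutes) of downtime per year")
```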
All of this said - we are working on a new platform that will provide HA, increased speed and consistency, and ideally the only time we should see any issues or outages is if the power provided to us or the network provided to us were to fail us.
I guess what I'm trying to do is encourage you to imagine what your hosting company would be like without these outages. I suspect you would become number one or two in the reviews if you could fix this problem, and gain a lot more customers / growth as a result.
We're always working to improve, but a majority of the outages over the last year have been through no fault of our own and there's no way we could have prevented them. For example, a DDoS that's large enough to disrupt the entire facility's network isn't something we could prevent [and it has happened a few times].
You guys do an amazing job and I understand you're a small team. Certainly one or two guys can't remember to do everything, and from that point of view I'm not fussed that your Twitter account was broken etc. Had I had a large site, though, that could of course have translated to a lot of money and I would be looking elsewhere.
I totally understand - and we're always working to improve. I always appreciate any and all feedback - especially if it's a customer telling us we're not doing something well.
Food for thought?
Thanks for reading.
Absolutely. Do feel free to reach out via the ticket system anytime with any feedback you have. If you want you can ask for me and they'll transfer the ticket over - just bear in mind I'm not always around 24/7. That said - you can also PM me here.