MDDHosting Forums

R1, S1, P1, SpamExperts Scanners outage


So, for two hours the problem was at a different datacenter than was initially thought?


A simple tracert showed where the network problem was:


Tracing route to ************* []
over a maximum of 30 hops:
1 <1 ms 1 ms 1 ms
2 1 ms 1 ms 1 ms
3 45 ms 39 ms 39 ms
4 599 ms 52 ms 53 ms athe-crsb-hera-gsra-1.backbone.otenet.net []
5 77 ms 53 ms 47 ms ten0-1-0-0-crs01.ath.oteglobe.gr []
6 91 ms 91 ms 91 ms
7 320 ms 96 ms 139 ms 40ge1-3.core1.lon2.he.net []
8 * 177 ms 158 ms 100ge1-1.core1.nyc4.he.net []
9 182 ms 176 ms 186 ms 100ge7-2.core1.chi1.he.net []
10 250 ms 208 ms 199 ms 10ge15-2.core1.den1.he.net []
11 199 ms 198 ms * handy-networks-llc.gigabitethernet2-11.core1.den1.he.net []
12 * * * Request timed out.
13 * * * Request timed out.
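For anyone who wants to read a trace like this programmatically, here is a minimal sketch that parses the Windows tracert output above and reports the last hop that answered. The names `parse_hops` and `last_responding_hop` are made up for illustration, and "<1 ms" is treated as 1 ms. Keep in mind that a trace going silent after a hop only suggests where to look; routers that filter ICMP can produce the same pattern.

```python
import re

# The trace from the post above, copied as-is (hostnames are missing on some hops).
TRACE = """\
Tracing route to ************* []
over a maximum of 30 hops:
1 <1 ms 1 ms 1 ms
2 1 ms 1 ms 1 ms
3 45 ms 39 ms 39 ms
4 599 ms 52 ms 53 ms athe-crsb-hera-gsra-1.backbone.otenet.net []
5 77 ms 53 ms 47 ms ten0-1-0-0-crs01.ath.oteglobe.gr []
6 91 ms 91 ms 91 ms
7 320 ms 96 ms 139 ms 40ge1-3.core1.lon2.he.net []
8 * 177 ms 158 ms 100ge1-1.core1.nyc4.he.net []
9 182 ms 176 ms 186 ms 100ge7-2.core1.chi1.he.net []
10 250 ms 208 ms 199 ms 10ge15-2.core1.den1.he.net []
11 199 ms 198 ms * handy-networks-llc.gigabitethernet2-11.core1.den1.he.net []
12 * * * Request timed out.
13 * * * Request timed out.
"""

# One probe result: a round-trip time such as "45 ms" / "<1 ms", or "*" for a timeout.
SAMPLE = re.compile(r"<?\d+ ms|\*")


def parse_hops(text):
    """Return one dict per hop line: hop number, three RTTs (None = timeout), host."""
    hops = []
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.*)", line)
        if not m:
            continue  # header lines do not start with a hop number
        rest = m.group(2)
        tokens = SAMPLE.findall(rest)[:3]
        rtts = [None if t == "*" else int(re.sub(r"\D", "", t)) for t in tokens]
        host = SAMPLE.sub("", rest, count=3).replace("[]", "").strip()
        if all(r is None for r in rtts):
            host = ""  # "Request timed out." is a message, not a hostname
        hops.append({"hop": int(m.group(1)), "rtts": rtts, "host": host})
    return hops


def last_responding_hop(hops):
    """Return the last hop with at least one answered probe, or None."""
    answered = [h for h in hops if any(r is not None for r in h["rtts"])]
    return answered[-1] if answered else None


if __name__ == "__main__":
    last = last_responding_hop(parse_hops(TRACE))
    print(f"Last hop that answered: {last['hop']} {last['host']}")
```

Run against the trace above, this points at the last Hurricane Electric hop in Denver, with everything after it timing out.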

The fault was with the second DC location. The datacenter has links from the main datacenter to the second datacenter as well as direct internet links. The currently reported fault is with Level3 and its connection to that datacenter in Denver.


Wow, this is a long one. I sure hope we don't have to change servers again... don't think I could handle another move.


Thanks for the updates, as I have clients waiting for answers. I know you are doing what you can.

The migrations were to move us to the new hardware/infrastructure and then some secondary migrations to move from CentOS7 to CentOS6. Everybody is on CentOS6 now and there is nothing wrong with our new hardware, servers, network. The issue is outside of our border and we have zero control over it.


At this point we're at the mercy of our facility and they are at the mercy of their transit providers. This isn't affecting just us - it's affecting everybody in the facility which is tens of thousands if not hundreds of thousands of users - us included.


I do apologize for this outage and will most certainly pass on the Reason For Outage, or RFO, once it is available from our upstream provider.


Here are the status updates from them [not very descriptive but will give you an idea of the level of information we've had available to us]:


Update - 02:06AM MDT:

Connectivity to our DTC location has been restored, but we are still working with Level3 to ensure that the problem has been completely resolved.


Update - 01:50AM MDT:

Level3 has identified an issue in the Denver Metro area, and is working to resolve it. We will continue to provide updates as we receive them.


Update - 12:41AM MDT:

We are in contact with our transit provider to identify any network connectivity problems. We are also on-site reviewing our network gear for issues.


We have been alerted of connectivity issues at our Denver Tech Center location. We are working on the issue as quickly as possible and will update here.

We are 2 hours ahead of MDT.

That said, the network is online and operational, and has been since before my last update to this thread.


None of our networking gear or servers had any issues. The best analogy I can make is that there was an accident on the highway between us and the internet - on a portion of the road that is not within our control. This stopped traffic from entering/leaving until the issue was resolved.


That said, I am certainly going to get with the facility concerning this, as *one* provider out of several dropping/having issues should not result in a total lack of connectivity. This defeats the whole purpose of having multiple transit providers available to us.


Is this the same data center you were using before?


It's good that it's back up, but this is the 4th outage I've experienced since the switch to new servers and site users are starting to question the reliability of the site.

The reliability of the *site* has nothing to do with the transfers and how well or badly they went for you. The facility itself was without connectivity as well - not just us. To speak plainly - yes - the migration you experienced did not go as smoothly as it should have, and for that I apologize.


This was a networking issue outside of our control. It had nothing to do with migrations.


I am sorry that you experienced issues with the migrations, however, if you wish to discuss those please open a ticket and ask for me and I'll be happy to discuss them with you.


The very short outage for the S1 server just now was not related to this. It was due to a LiteSpeed licensing issue which I resolved.


I am going to open a ticket with LiteSpeed, as it is supposed to switch to Apache when there is a licensing issue, but it did nothing [LSWS didn't start, and Apache didn't get started either].


I'm just updating this thread as a couple of users have asked if the issues were related.


I have more detail from the facility.


We are located in the H5 location in Denver, and our bandwidth is transported by Level3 over two physically diverse 10G links from H5 to 1801, where our bandwidth reaches the carriers.


The idea of physically diverse fiber routes is that one could go down [hardware failure, a fiber cut, etc.] and we should retain connectivity. The facility is currently working with Level3 to ascertain why this was able to take down both links, as well as what is going to be done to ensure it doesn't happen again.
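To illustrate why two links can still fail together, here is a rough sketch that lists what the two paths have in common. It only models what is stated above (both links carried by Level3 between H5 and 1801); the link names, the route labels and the `shared_dependencies` helper are hypothetical, and it says nothing about what actually failed, which is for the RFO to explain.

```python
# Hypothetical model of the two links described above. The carrier and the
# endpoints come from the post; the link names and route labels are placeholders.
PATHS = {
    "link_a": {"carrier": "Level3", "endpoints": "H5-1801", "fiber_route": "route-1"},
    "link_b": {"carrier": "Level3", "endpoints": "H5-1801", "fiber_route": "route-2"},
}


def shared_dependencies(paths):
    """Return the attribute/value pairs that every path has in common.

    Anything returned here is a candidate common point of failure: if it
    breaks, physical diversity elsewhere does not help.
    """
    if not paths:
        return {}
    per_path = [set(attrs.items()) for attrs in paths.values()]
    return dict(set.intersection(*per_path))


if __name__ == "__main__":
    for name, value in sorted(shared_dependencies(PATHS).items()):
        print(f"shared by both links: {name} = {value}")
```

In this model the fiber routes differ, but the carrier and the two end locations are shared by both links.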


It is a bad day all around for you, for us, and for our provider. All of us experienced an outage that was not within our control and was unplanned.


Once the official RFO is released detailing exactly what happened and why, I will make it available, but as the issue wasn't on our end/within our control I do not have an ETA.

