S1 / R1 Servers - Network Device Updates - ~9 PM ET Jan 12, 2016.

Scheduled

42 replies to this topic

#1 MikeDVB

    Forum Administrator

  • Staff Administrator
  • 2,900 posts
  • Gender:Male
  • Location:Central Indiana, USA

Posted 12 January 2016 - 05:32 PM

Hello!

 

One of the great features of our new infrastructure is that we can migrate a server from one piece of hardware to another seamlessly.  This allows us to perform maintenance without downtime.

 

Unfortunately, the networking cards in the new hardware were not updated to the latest firmware version, which results in occasional latency issues.  We've been monitoring both servers on a second-by-second basis since we resolved the LiteSpeed issue and the server has been stable.

 

The issue comes in when we do wish to seamlessly migrate the server - the host goes unresponsive for 30 to 60 seconds as the network interface crashes and restarts.

 

We will be bringing down the S1 and R1 servers tonight for approximately 5 minutes (as long as it takes to shut down and boot back up) to update the networking firmware, after which we should be able to perform future maintenance without any scheduled downtime.

 

We expect to begin around 9 PM ET and expect the downtime to be no greater than 5 minutes.  We do apologize for the growing pains we are experiencing with this new hardware - it's a completely new setup from what we have run for years and we're running into small edge issues that we didn't anticipate.

 

If you have any questions about this, let us know.  We'll keep it as seamless as possible, and there is nothing you need to do.


Michael Denney - MDDHosting LLC - Providing Hosting since 2007
Scalable shared hosting plans in the cloud! Check them out!
Highly Available Cloud Shared, Reseller, and VPS
http://www.mddhosting.com/

#2 MikeDVB

Posted 12 January 2016 - 07:34 PM

It looks like the S1 server's networking card has gone unresponsive.  We're bringing it back online on another piece of hardware and it should be online momentarily.



#3 MikeDVB

Posted 12 January 2016 - 07:37 PM

S1 is back online on another piece of hardware.  ~1 hour 20 minutes until we begin updating firmware on networking controllers.



#4 MikeDVB

Posted 12 January 2016 - 08:11 PM

It looks like we can no longer wait to flash this update - we're going to have to proceed immediately.



#5 MikeDVB

Posted 12 January 2016 - 08:48 PM

S1 should be stable at this time.  Everything is updated and running smoothly.



#6 MikeDVB

Posted 18 January 2016 - 11:46 AM

For those that want more detail - this is the issue we're currently facing with the S1 server:

Message from syslogd@s1 at Jan 16 17:17:27 ...
 kernel:BUG: soft lockup - CPU#44 stuck for 23s! [imap-login:1021336]

Message from syslogd@s1 at Jan 16 17:17:27 ...
 kernel:BUG: soft lockup - CPU#11 stuck for 22s! [migration/11:159]

Message from syslogd@s1 at Jan 16 17:17:27 ...
 kernel:BUG: soft lockup - CPU#1 stuck for 144s! [mysqld:807721]

Message from syslogd@s1 at Jan 16 17:17:27 ...
 kernel:BUG: soft lockup - CPU#17 stuck for 140s! [mysqld:4000]

Message from syslogd@s1 at Jan 16 17:17:27 ...
 kernel:BUG: soft lockup - CPU#22 stuck for 133s! [migration/22:214]

Message from syslogd@s1 at Jan 16 17:17:27 ...
 kernel:BUG: soft lockup - CPU#19 stuck for 151s! [mysqld:4374]

 

When this happens, we also lose connectivity to our server storage.  We're unsure whether the storage losing connectivity is causing the CPU errors or the CPU errors are causing the storage connectivity issues.

 

The plan today was for me to spin up several test systems on the new hardware and to work as hard as I can to identify and isolate the cause of the issues.  If we need new networking cards, we'll get them.  If we need to change the operating system, we'll do it.

 

At the end of the day I do want to apologize deeply for any and all trouble this issue is causing you.  We're no happier about it than you are and will have it resolved as quickly as we possibly can.

 

I'm going to do my best to keep this thread up to date so you know what is going on.  I do have an update to post - and will post it in a moment to keep it separate from this one.
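For readers who want to dig into similar output on their own systems, here is a minimal sketch of how the soft lockup messages quoted earlier in this post could be tallied per process. It is only an illustration: the function name `summarize_lockups` and the variable `SAMPLE_LOG` are made up for this example, and the sample text is copied from the messages above rather than read from a live log.

```python
import re

# Matches the kernel line format quoted in this post, e.g.
#  kernel:BUG: soft lockup - CPU#44 stuck for 23s! [imap-login:1021336]
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]]+):(?P<pid>\d+)\]"
)


def summarize_lockups(log_text):
    """Return {task: (lockup_count, longest_stall_seconds)}.

    Per-CPU kernel threads such as "migration/11" are grouped under
    their base name ("migration"). Hypothetical helper, not part of
    any tool mentioned in this thread.
    """
    summary = {}
    for match in LOCKUP_RE.finditer(log_text):
        task = match.group("task").split("/")[0]
        secs = int(match.group("secs"))
        count, longest = summary.get(task, (0, 0))
        summary[task] = (count + 1, max(longest, secs))
    return summary


# Sample input copied from the messages quoted above.
SAMPLE_LOG = """\
 kernel:BUG: soft lockup - CPU#44 stuck for 23s! [imap-login:1021336]
 kernel:BUG: soft lockup - CPU#11 stuck for 22s! [migration/11:159]
 kernel:BUG: soft lockup - CPU#1 stuck for 144s! [mysqld:807721]
 kernel:BUG: soft lockup - CPU#17 stuck for 140s! [mysqld:4000]
 kernel:BUG: soft lockup - CPU#22 stuck for 133s! [migration/22:214]
 kernel:BUG: soft lockup - CPU#19 stuck for 151s! [mysqld:4374]
"""

for task, (count, longest) in sorted(summarize_lockups(SAMPLE_LOG).items()):
    print(f"{task}: {count} lockup(s), longest {longest}s")
```

A tally like this only shows which processes were stalled and for how long. It does not by itself tell us whether the storage or the CPU side failed first.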



#7 MikeDVB

Posted 18 January 2016 - 11:47 AM

The S1 server had another brief outage; however, this time around the OS marked the file system as read-only, so we should perform a file system integrity check.

 

Due to the speed of the new hardware this should not take long - but due to the amount of data it could take up to an hour or two.  Ideally it will be done in 10 to 15 minutes.

 

I'll keep this thread updated.  We really need to do this as soon as possible.
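If you run your own Linux system and want to check for the same condition, a rough sketch is below. It assumes the standard `/proc/mounts` layout (device, mount point, file system type, options) and looks for the `ro` option. The function name `read_only_mounts`, the device names, and the `ext4` type in the sample are all hypothetical; nothing here describes how S1 itself is set up.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains 'ro'.

    Expects text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result


# Hypothetical sample; on a live host you could instead pass the
# contents of /proc/mounts.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 ro,relatime,data=ordered 0 0
/dev/sda2 /boot ext4 rw,relatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""

print(read_only_mounts(SAMPLE_MOUNTS))
```

A file system that shows up here unexpectedly was usually remounted read-only by the kernel after an I/O or consistency error, which is why an integrity check is the next step.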



#8 MikeDVB

Posted 18 January 2016 - 11:49 AM

The server is online; however, you may have some issues accessing some files through the browser.  If you do, open a ticket and we'll get it fixed for you quickly.

 

We're still going to perform the FSCK, but I'm going to try to send out an email to everybody beforehand so you know what's going on even if you aren't watching this thread.



#9 MikeDVB

Posted 18 January 2016 - 11:52 AM

It looks like a glitch in internal communication has resulted in the FSCK beginning now.

 

We'll keep this updated.



#10 MikeDVB

Posted 18 January 2016 - 11:58 AM

The File System Check is now on Pass 2 - Directory Structure.



#11 MikeDVB

Posted 18 January 2016 - 12:07 PM

It is still in pass 2.



#12 ericr

    Staff

  • Staff Administrator
  • 224 posts
  • Gender:Male

Posted 18 January 2016 - 12:12 PM

The server is booting up at this time.



#13 ericr

Posted 18 January 2016 - 12:15 PM

It is now doing its boot-up defrag and will be online soon.  It is currently 75.2% of the way through the process.



#14 MikeDVB

Posted 18 January 2016 - 12:16 PM

And now the system forced another FSCK - this one is just a quick check so it should be up very soon.



#15 MikeDVB

Posted 18 January 2016 - 12:18 PM

The forced check is at 81%.



#16 ericr

Posted 18 January 2016 - 12:23 PM

It is now at 86%.



#17 MikeDVB

Posted 18 January 2016 - 12:30 PM

It is at 95%.



#18 MikeDVB

Posted 18 January 2016 - 12:31 PM

Completed and rebooting.



#19 MikeDVB

Posted 18 January 2016 - 12:45 PM

The system has gone into a file system check loop even though it's not repairing anything.

 

We're working on getting it out of this loop.



#20 ericr

Posted 18 January 2016 - 12:52 PM

I apologize for the delay.  After the boot the server became hung again and we needed to resolve the underlying issues.  The issues are resolved and the server is online.

