Michael D. Posted February 19, 2013 Update - 02/20/2013 - 9:40 PM ET: This issue is now resolved. ============ On Monday, February 18th at approximately 6:20 AM ET, the Icarus server crashed with a kernel panic (i.e. the operating system hit an error it could not recover from, so it 'gave up' and crashed). This sort of situation is out of our control beyond choosing which version of the system kernel we boot into. One would think the latest 'stable' version would be stable, but more often than not we've resolved one problem and found two new ones during kernel upgrades. Unfortunately, kernel upgrades are required not only for security purposes but also to resolve issues that can cause the server to lock up unexpectedly, as it did this morning. While troubleshooting the panic, we decided to upgrade to a newer version of the kernel. This morning at approximately 2:50 AM ET all was well until our R1Soft backups started to run, at which point the server hung and stopped responding, forcing a reset. We believe this to be a kernel issue, but it's possibly an R1Soft issue. We're going to be working with both software vendors to try to find the source of the issue. If you have any questions at all, just let us know.
Dean Posted February 19, 2013 Hi Michael, To the best of your knowledge, how many server drops should we expect? Regards
Scott Posted February 19, 2013 Dean wrote: "From your best knowledge, How many server drops are we likely to expect?" Given the nature of this issue, there is no estimate for this.
Dean Posted February 19, 2013 OK thank you for your response! =D
Michael D. Posted February 19, 2013 We don't expect any further instability at this time, as we've temporarily disabled the backup system for the server. We're going to work with our vendors to try to resolve the issue before re-enabling backups. When we are prepared to give it a try, we'll update this thread beforehand.
Michael D. Posted February 21, 2013 Within the next few hours we are going to attempt to start the R1Soft backup process for this server. We're going to have a technician sitting at the system and logged in, so that if the system locks up again we should be able to kill the backup without rebooting the server.
Michael D. Posted February 21, 2013 The backup started immediately instead of waiting in its queue, and it did indeed take the server offline again. The server is in the process of rebooting.
Michael D. Posted February 21, 2013 The server is back online.
Michael D. Posted February 21, 2013 After a couple of reboots, we're now on a known-good kernel (we have other servers running it and backing up nightly without issues). We're going to mark this issue as resolved.