As many of you have pointed out, one of our servers had been performing extremely slowly for about the last month. I’m happy to say that the issue has been found and resolved. For those of you who have been asking what the issue was, I want to take a few minutes to answer your question and to share the steps we took to locate the issue and correct it. Although I’m not going to go into all of the details in this article, I will try my best to explain the majority of what went on. If any of you have more specific questions about anything in this article, feel free to ask them in the comments below and I’ll answer as best I can.
Yes, the rumors are true. One of our servers was recently compromised. Apparently, unbeknownst to me, one of our co-located servers was running an unpatched FTP daemon. Firstly, I know how insecure FTP is. Secondly, I don’t like running anything unpatched or out of date. So, let’s just say I was a little more than pissed to find out that our network admin had been slacking on his job. Even though we’re getting out of the web hosting business, I still expect people to do the work I am paying them for. But, that’s for another article.
About a month ago, we started getting reports that one of our load-balanced servers was taking upwards of 60 seconds to respond. So, the first thing I did (after contacting my network guy and finding out he hadn’t been keeping up the servers as he was supposed to) was to ping the server from a remote location. The ping responded as expected. The next thing I did was to bring up a web page hosted on that server. Unfortunately, the reports were correct. It took more than 60 seconds for the page to load. So, I made a secure remote connection to the box to see where the hangup was. When I connected via SSH, the response time was also slow. But, I finally got connected and began the troubleshooting process.
The first command I issued was “ps -aux”. I wanted to see all the processes running on that machine and who was running them. However, the results didn’t show anything running unexpectedly. So, I next ran the “top” tool. Again, everything looked good. CPU and memory utilization were “normal”, swap wasn’t being used at all, and there were no rogue programs running.
After seeing that there was nothing running on the machine that shouldn’t have been, I suspected it might be a hardware issue. Luckily, this particular server is at a location within driving distance. So, I loaded up and drove to the data center. When I got there, I noticed that one of my network cards wasn’t responding. Could this have been the problem? With a little bit of “ifconfig” magic, I got the card back online. Once it was, I tested the server again, but the problem was still there.
Still thinking this might be a hardware issue, I disconnected the ethernet cables from both network cards and tried to SSH back into the machine. When connecting on localhost, everything was fine. But, when I tried using the IP address of either of the network cards, the 60-second delay once again showed its ugly face. While the ethernet cables were both pulled, I decided to try to ping an outside server. For some reason, I got back a successful ping response. How could this be? Neither card was plugged in.
Again, I ran “top” and “ps”, but found nothing. I went digging through all of my log files and still couldn’t find anything out of the norm. This is when I began realizing it wasn’t hardware related, but was instead a possible compromise. To test that theory, I used “yum” to install “htop”. After “htop” was installed, I noticed all kinds of processes running that “top” wasn’t reporting. For example, htop reported “ttyload” and “ttymon” when top didn’t. After a little bit of research, I found that these files were part of the SHV4 and SHV5 rootkits. These rootkits were responsible for replacing several of my system applications such as top, ifconfig, ps, netstat, lsof, etc…
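The htop comparison above works because a freshly installed tool reads the process table itself instead of trusting the replaced binaries. The same idea can be sketched without installing anything: list the numeric directories under /proc and compare them with what “ps” reports. This is only a rough sketch, assuming a Linux box with /proc mounted; the function names are made up for illustration, and a rootkit that hooks the kernel rather than swapping binaries could hide from /proc as well.

```shell
#!/bin/sh
# Sketch only: find PIDs that exist under /proc but are absent from a ps listing.
# proc_pids and pids_missing_from are hypothetical helper names.

# Print every numeric directory name under the given proc root (default /proc).
proc_pids() {
    root=${1:-/proc}
    for dir in "$root"/[0-9]*; do
        [ -d "$dir" ] && printf '%s\n' "${dir##*/}"
    done
    return 0
}

# Given two files of PIDs (one per line), print PIDs found only in the first.
pids_missing_from() {
    tmp=$(mktemp -d) || return 1
    sort -u "$1" > "$tmp/a"
    sort -u "$2" > "$tmp/b"
    comm -23 "$tmp/a" "$tmp/b"   # lines unique to the first list
    rm -rf "$tmp"
}

# Example use:
#   proc_pids > /tmp/proc.list
#   ps -e -o pid= | tr -d ' ' > /tmp/ps.list
#   pids_missing_from /tmp/proc.list /tmp/ps.list
```

Expect a few false positives from short-lived processes that start or exit between the two listings, so anything it prints is a lead to check by hand (for example by reading /proc/PID/cmdline), not proof of a rootkit.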
Now that I knew what the problem was, I knew what to do to fix it. To begin with, I had to use “htop” to kill the “ttyload” and “ttymon” processes. Next, I needed to get rid of them from the file system. Before doing that, I installed a sweet little program called “rkhunter” that, when run, verified that the SHV4 and SHV5 rootkits were indeed installed on my computer and also informed me of several other issues I needed to resolve. I also had to reinstall netstat and most of the other apps so that I could do a “real” check to make sure there were no other rogue processes running.
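Since this box was managed with yum, one way to decide which apps need reinstalling is to let RPM compare the installed files against its package database. In “rpm -V” output, a “5” in the third position of the flag string means the file’s digest no longer matches. The helper below is a minimal sketch that filters that output; its name is hypothetical, the package names in the example are assumptions that vary between distributions, and an attacker with root could have tampered with the RPM database too, so a clean result is not a guarantee.

```shell
#!/bin/sh
# Sketch only: read `rpm -V` output on stdin and print the paths whose
# digest changed. changed_digest_paths is a hypothetical helper name.
# Note: paths containing spaces are not handled by this simple filter.
changed_digest_paths() {
    awk 'substr($1, 3, 1) == "5" { print $NF }'
}

# Example use (package names are assumptions; adjust for your distribution):
#   rpm -V procps net-tools coreutils lsof | changed_digest_paths
```

Anything it prints under /bin, /sbin, or /usr/bin is a good candidate for reinstalling from a trusted package source.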
Both of the rootkits that were installed had opened ports for listening. They also created backups of some of my files and created new files such as “/usr/lib/libsh/hide”, “/usr/lib/libsh/.backup”, “/usr/lib/libsh/.sniff”, “/lib/libsh.so/sshk”, “/lib/libsh.so/shdcf”, “/usr/sbin/ttyload”, and a few others. However, when I tried to “ls -la” those files and folders, none of them showed up. This was because they were flagged as immutable and hidden. In order to “see” them, I had to issue an “lsattr” command on the parent folder. Once the files appeared, I could issue commands like “chattr -sia /bin/ls” to unhide them and to remove the immutable flag.
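To make the lsattr step a little less tedious, its output can be filtered down to just the files carrying the immutable (“i”) or append-only (“a”) attribute. Here is a minimal sketch; the function name is hypothetical, and it assumes the usual lsattr layout of an attribute string followed by a path.

```shell
#!/bin/sh
# Sketch only: read `lsattr` output on stdin and print the paths flagged
# immutable (i) or append-only (a). locked_paths is a hypothetical name.
locked_paths() {
    # The first field must look like an attribute string and contain i or a;
    # strip it and print the remaining path (which may contain spaces).
    awk '$1 ~ /^[-a-zA-Z]+$/ && $1 ~ /[ia]/ { sub(/^[^ ]+ +/, ""); print }'
}

# Example use, as root (directory taken from the list above):
#   lsattr -a /usr/lib/libsh 2>/dev/null | locked_paths
# Then clear the flags on each path it prints before deleting it:
#   chattr -ia /path/printed/above
```

If “ls” or “lsattr” themselves have been replaced, run this only after reinstalling them from a trusted source.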
Anyway, after an extensive amount of cleaning and reinstalling apps, I finally got everything cleaned up. However, even after the rootkits had been removed, Apache still seemed to be responding slowly. With a little more digging, I found that my “/etc/resolv.conf” file had a bogus nameserver IP address listed in the first position, and this file too was flagged as immutable. After removing the immutable flag and the bogus IP address, everything went back to normal.
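The resolver tries nameservers in the order they are listed, so a bogus first entry can stall every lookup until it times out. A quick check is to compare the file against the nameservers you expect to be there. The sketch below is illustrative only: the function names are hypothetical and the addresses in the example are placeholders from the documentation ranges, not the ones from this incident.

```shell
#!/bin/sh
# Sketch only: flag nameserver entries that are not on a known-good list.
# list_nameservers and unexpected_nameservers are hypothetical helper names.

# Print each nameserver address in a resolv.conf-style file.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Usage: unexpected_nameservers FILE ALLOWED_IP...
# Prints every nameserver in FILE that is not one of the allowed addresses.
unexpected_nameservers() {
    file=$1
    shift
    list_nameservers "$file" | while read -r ns; do
        known=no
        for allowed in "$@"; do
            [ "$ns" = "$allowed" ] && known=yes
        done
        [ "$known" = yes ] || printf '%s\n' "$ns"
    done
}

# Example use (placeholder addresses; substitute your real resolvers):
#   unexpected_nameservers /etc/resolv.conf 192.0.2.53 198.51.100.53
```

If it prints anything, check the file’s attributes with “lsattr /etc/resolv.conf” before editing, since an immutable flag will make the edit fail even as root.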
Now, I know that everyone in the forums suggests wiping out the entire box and starting over, but that wasn’t a possibility for me in this situation. So, I had to take the manual steps required to “fix the glitch” myself. Besides, I learned a lot of new tricks and found some really useful tools such as rkhunter and chkrootkit, which will come in handy if anything like this ever happens again (and from my experience of owning a web hosting company, I know it will).