Every once in a while one of my websites gets hit by a poorly behaved web crawler or a spambot. The symptoms are usually the same: either the same page gets loaded over and over, or nonsensical URLs based on valid URLs are hit. Either way, the effect on the web server is debilitating: the extra traffic hammers the database and pushes network usage to 5-10 times what is normal for the website. We can't have that.
My usual method for mitigating this is to use UNIX command line utilities such as tail, cut, and grep to identify the offender and then block them with iptables. But I got tired of cobbling together the same commands every time this happened, so I decided to write nginxtop!
Here's how to install it:
npm install -g nginxtop
If you'd rather work from the source code:
git clone git@github.com:dmuth/nginxtop.git
cd nginxtop
Once installed, here's sample usage:
tail -f /var/log/nginx/access.log | nginxtop [-n num_hosts_to_print] [-i report_interval_in_seconds]
The specified number of hosts is printed once per reporting interval, with the hosts that made the most requests at the top.
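To make the ranking concrete, here is a minimal sketch of the counting idea in Node.js. It is not nginxtop's actual source: the function names `countHits` and `topHosts` are hypothetical, the sample log lines are made up, and it assumes the client address is the first whitespace-delimited field of each access log line, as in nginx's default combined log format.

```javascript
// Hypothetical sketch, not nginxtop's actual implementation.
// Assumes the client address is the first whitespace-delimited field of each line.

// Tally requests per host from an array of access log lines.
function countHits(lines) {
  const hits = new Map();
  for (const line of lines) {
    const host = line.trim().split(/\s+/)[0];
    if (!host) continue; // skip blank lines
    hits.set(host, (hits.get(host) || 0) + 1);
  }
  return hits;
}

// Return up to numHosts [host, count] pairs, highest hit count first.
function topHosts(hits, numHosts) {
  return [...hits.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, numHosts);
}

// Made-up sample lines for illustration only.
const sample = [
  '127.0.0.1 - - [01/Jan/2013:00:00:00 +0000] "GET / HTTP/1.1" 200 612',
  '127.0.0.2 - - [01/Jan/2013:00:00:01 +0000] "GET /a HTTP/1.1" 200 612',
  '127.0.0.1 - - [01/Jan/2013:00:00:02 +0000] "GET /b HTTP/1.1" 404 162',
];
console.log(topHosts(countHits(sample), 2));
```

A real tool would also reset or age out the counts between reports and read lines from standard input as they arrive; this sketch only covers the count-and-sort step.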
If you checked out the source code, here's how to test:
tail -fn100 test.log | ./nginxtop.js -n 5
Once every second, you'll see output like this:
Nginxtop Top Hosts
===========================================
127.0.0.10: 14 hits
127.0.0.1: 7 hits
127.0.0.2: 6 hits
127.0.0.3: 6 hits
127.0.0.4: 5 hits
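If you want to produce a report in the same shape from your own scripts, a rough sketch might look like the following. The function name `formatReport` and the separator width are assumptions for illustration, not taken from nginxtop's source.

```javascript
// Hypothetical sketch: render [host, count] pairs as a title, a separator line,
// and one "host: N hits" line per entry.
function formatReport(entries) {
  const lines = ['Nginxtop Top Hosts', '='.repeat(43)]; // separator width is an assumption
  for (const [host, count] of entries) {
    lines.push(`${host}: ${count} hits`);
  }
  return lines.join('\n');
}

console.log(formatReport([
  ['127.0.0.10', 14],
  ['127.0.0.1', 7],
]));
```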
That's it for Nginxtop! Feel free to reach out with any questions or comments.
If you read this far, you should probably follow me on Twitter.