As folks noticed on Thursday morning when we opened hotels, the website displayed errors for 5-10 minutes starting at 9 AM EST due to the rush of traffic that came in. This was due to a few factors, which I'd like to go into in technical detail below.
First, some raw stats:
Peak bandwidth: approx 3.6 Megabits/sec
Peak connections: 1,060 concurrent connections
Number of users logged into the site: 100+
Here are some graphs that show just how big those numbers are, compared to normal traffic levels:
For those who saw last year's post about the load on the webserver when we opened up hotels, this year's traffic was about twice what last year's traffic was. I didn't see that coming.
As is plainly visible in the traffic graph, the machine that this website runs on is capable of much higher bandwidth throughput. So, what happened?
In a word: caching. Or rather, the lack thereof in certain cases.

I finally did it. I bit the bullet and upgraded my site to Drupal 6. Yes, 6.
I had been meaning to upgrade the website YEARS ago, but a number of factors conspired against me, including:
- Waiting on a Drupal 6 port of Acidfree
- Real life $WORK-related things
- Many many real life commitments.
Be that as it may, the upgrade is finally complete, and my website is now running something just a little more modern. :-) I'll roll out more features to the website in the future, including a brand new commenting system. In the meantime, be sure to subscribe to my RSS feed and/or reach out to me on any of the social networks I'm on.
I'll see you out there.
Earlier this weekend, I began work on a new Drupal module: Fivestar Stats.
Right now, it does the following:
I have more features (and screenshots!) planned, but this is just the start.
Fivestar Stats can be downloaded from:
http://github.com/dmuth/fivestarstats
Enjoy!
There has been some increased activity on the Pennsylvania Furries website recently, so I decided it justified finally biting the bullet and upgrading the site from Drupal 4.7 to Drupal 6.16. That upgrade was fun and... interesting. But now that I have a much more recent version of Drupal installed, I've been able to add some features such as these:
- Event Calendar. Want to know when upcoming furmeets, gathers, etc. are? We now have a calendar that anyone may post events to.
- Buddy lists. Yes, they are what you think they are. Users can now have an actual "friends" list on this site. Go to any user's page and click the link towards the bottom to become their buddy. (Here's an example)
- Private messages. Just what the name implies. Users on the site can now send each other private messages. When you're logged into the site, just click the "Messages" link to get started.
- Searching. I reindexed the entire site (and set up crontabs to keep that happening) so now the search engine works again, and posts (and users) on this site are fully searchable.
- Tagging posts. All posts can now be tagged. Even posts made by others. Go ahead. Give it a try. There is also a tag cloud that lists the most popular tags on the site.
- User profiles that don't suck. All user profiles now have links to a number of social networks, along with support for forum signatures, private messaging, and buddy lists.
And the URL for the Pennsylvania Furries website?
http://pa-furry.claws-and-paws.com/
Check it out, and let me know what you all think!
One of the neat things about the Drupal CMS is that it has a facility to schedule certain things to be run at regular intervals. For example: sending out subscription notifications or indexing posts for the search function. Those operations can take tens of seconds to complete, so you obviously don't want them executing in the middle of a page load.
The normal way to execute Drupal's crontabs is to load the script cron.php via the web interface. This is a terrible idea, since anyone with an Internet connection could cause your machine to run CPU-intensive tasks at will, and bring the webserver to its knees.
An alternative way to run crontabs is with a third-party module called Poormanscron. It's a nice module, but the locking facility it has isn't so hot, and on more than one occasion I've had two cron runs execute in parallel, causing extra load on the machine. In especially bad (read: I/O bound) situations, multiple crontabs get "backed up" onto each other, and that creates huge load spikes and 60-second+ page loads until things calm down again.
There is yet a third way to run Drupal crontabs, and that's via the command line. It gives you maximum control over when and where crontabs run, and this is particularly important when you have multiple Drupal sites running. And I'm going to show you how to do it.
Step 1) Download Drush, the Drupal shell. It's a great little app for accessing your Drupal installation and performing common tasks from the command line. Install Drush (I usually put it in /usr/local/drush/, and make a symlink to it from /usr/local/bin/drush).
Step 2) Create a script in your home directory, and call it drupal-crontabs.sh. It should look something like this:
#!/bin/sh
#
# Our main directory for holding websites
#
ROOT=/var/www
export DRUSH_OPTIONS="-q"
#export DRUSH_OPTIONS="-v" # Debugging
#
# Errors are fatal
#
set -e
cd $ROOT
#
# All of our drupal installations under $ROOT
#
SITES="furryconnectionnorth.com"
SITES="$SITES saveardmorecoalition.org"
SITES="$SITES claws-and-paws.com"
SITES="$SITES sabanews.org"
#
# Loop through our sites to run crontabs on
#
for SITE in $SITES
do
cd $SITE
#
# Disable errors, since sometimes crontabs have issues.
#
set +e
drush $DRUSH_OPTIONS cron
set -e
cd ..
done
Be sure to season the $ROOT and $SITES variables to taste as per your specific installation.
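If the list of sites changes often, the SITES variable could be derived rather than hard-coded. The following is only a sketch, not part of my actual script: it assumes each Drupal installation sits in its own directory directly under $ROOT and contains sites/default/settings.php, and the function name list_drupal_sites is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every directory under the given
# root that looks like a Drupal installation (assumption: it contains
# sites/default/settings.php).
list_drupal_sites() {
    root=$1
    for dir in "$root"/*/
    do
        # Skip anything without a Drupal settings file
        [ -f "${dir}sites/default/settings.php" ] || continue
        basename "$dir"
    done
}
```

With that in place, the SITES assignments could be swapped for something like SITES=$(list_drupal_sites "$ROOT"), provided none of the directory names contain spaces.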
If you run a chmod 755 on this script, you can execute it from the command line. However, you may run into file permission issues if the webserver is running as a different user than you (usually the case).
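One way to catch that permission problem early is to have the script refuse to run as the wrong user. This is a minimal sketch, assuming the webserver user is www-data (adjust for your system); the function name require_user is hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: succeed only if we are running as the expected user.
require_user() {
    expected=$1
    current=$(id -un)
    if [ "$current" != "$expected" ]
    then
        echo "Must be run as $expected, but running as $current" >&2
        return 1
    fi
}
```

Calling require_user www-data || exit 1 near the top of the script would make a manual run under your own account fail immediately instead of leaving files behind with the wrong owner. On systems with sudo, a manual run as the webserver user would look like sudo -u www-data /path/to/drupal-crontabs.sh.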
Step 3) Now you need to set up a crontab to run this script. My preferred way to do it is to create a file called /etc/cron.d/drupal-crontabs and place the following in it:
#
# Run all of our crontabs for Drupal once an hour
#
MAILTO=root
PATH=/usr/bin:/bin:/usr/local/bin
15 * * * * www-data /path/to/drupal-crontabs.sh
Make sure you are running these crontabs under the same user as your webserver.
That's it! If you want to see emailed notifications once an hour, just uncomment the line with the "-v" option for Drush. Otherwise, crontabs on all your sites will be run quietly once an hour. Provided they do not take longer than an hour to run, all of your search indexes and subscriptions will be kept up to date, and you will no longer have to worry about crontabs stomping over top of each other with Poormanscron. Enjoy!
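If a run might occasionally take longer than an hour, a lock can keep two runs from overlapping. This is a sketch only, not something from my script: it relies on mkdir either creating a directory or failing, and the lock path and function names are assumptions.

```shell
#!/bin/sh
# Hypothetical lock directory; pick a location writable by the webserver user.
LOCKDIR=${LOCKDIR:-/tmp/drupal-crontabs.lock}

# mkdir either creates the directory or fails, atomically, so only one
# process at a time can hold the lock.
acquire_lock() {
    mkdir "$LOCKDIR" 2>/dev/null
}

# Remove the lock directory so the next run can proceed.
release_lock() {
    rmdir "$LOCKDIR"
}
```

In the crontab script, one would call acquire_lock || exit 0 before the loop and set trap release_lock EXIT right after it. Note that a run killed with SIGKILL would leave the lock behind, so a stale lock directory may need to be removed by hand.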
Earlier today, I performed a major Drupal upgrade on another site that I run. Part of the upgrade involved me installing the Advanced Forum module to bring the forums a little more up to date with other sites that are out there.
Along the way, I learned something interesting: the Author Pane module does NOT display on blog posts.
It looked rather odd when comments on the blog posts had detailed user info, but the post itself did not. So I set out to fix that. I ended up commenting out the line print $picture; in node.tpl.php and instead adding in these lines:
$account = user_load($node->uid);
$template = "advf-author-pane";
$author_pane = theme('author_pane', $account,
advanced_forum_path_to_images(), $template);
print $author_pane;
The code is fairly straightforward. It loads user info on the author of the post, and the theme() function loads the author_pane template, passing in the user data.
Enjoy!
Wow, it's been a while since I've written here. Real life has had me very busy lately. I've done some neat things though, and I hope to post more about them soon.
The first neat thing I did recently was to roll out some badly needed updates for the user pages on anthrocon.org.
Before, I merely used the default pages that Drupal provided. The problem was that the pages looked a little... bland. Among other things, there were no icons for the various social networking services, and that just wouldn't do. So I read up on how to customize the user profile layout in Drupal and spent a couple of evenings writing some PHP code and making use of Drupal's theming functions.
Here are the old and the new pages side by side. Click on either to get a full page in a separate window.
[Screenshots: the old user page ("Old and busted") and the new user page ("New hotness")]
The upside of this effort is that when I'm ready to upgrade the Save Ardmore Coalition site to Drupal 6, I can pretty much just copy over my user templates on a wholesale basis, and save myself from having to redo all that work. :-)
While using the nginx webserver with a Drupal installation I noticed something odd. Whenever I submitted a form (with the GET method) or clicked on a link that had a space in it, the space would be converted to a plus sign in the URL. Actually, that's not the odd part: the web browser is doing proper URL encoding, since spaces are not allowed in URLs.
The weird part is that after the request got to the webserver, and the server processed it and passed it to PHP, and PHP loaded Drupal and executed the PHP code, the plus sign failed to be turned back into a space when using the nginx webserver. Works fine in Apache, but not in nginx.
The symptom is that if I tried to perform a search or edit my user profile, or anything else that involved a plus sign in the GET data, when that data was processed by PHP, the plus sign was still there. i.e., I would see "term1+term2" in the search box, instead of "term1 term2".
So, how to fix this? I could have spent some time reading RFCs, and looking through the source from PHP and nginx, or I could have my site working again. I opted for the latter, and found that this line of code at the very beginning of Drupal's index.php seems to do the trick:
$_GET["q"] = strtr($_GET["q"], "+", " ");
It's silly, and may even be a bit stupid, but it sure does the trick.
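To make the effect of that one-liner concrete, here is the same character substitution sketched in shell. It is an illustration only (the function name is made up), and it shares the workaround's limitation: a plus sign that really was meant as a plus gets turned into a space as well.

```shell
#!/bin/sh
# Illustration only: replace every "+" with a space, like the strtr() call above.
plus_to_space() {
    printf '%s\n' "$1" | tr '+' ' '
}

plus_to_space "term1+term2"    # prints: term1 term2
```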
When it comes to webhosting, you can't beat NearlyFreeSpeech. They're cheap, fast, and reliable. I've been a very happy customer for years, and run a number of Drupal installations under them.
However, NFSN does do things a little differently, and this can cause some interesting interactions with Drupal. The main thing that they do differently is that they use Squid and run reverse proxies at the edge of their network, which cache requests made to member websites. For most sites, this is not a problem. Drupal, however, tries to be "cache friendly" with regards to the headers it emits, and sometimes this doesn't work so well. I've seen the following symptoms happen under a virgin Drupal installation:
Here's how to fix that:
By examining that line, you can tell whether you are retrieving the most recent copy of the page or not, and this can be a valuable tool in troubleshooting cached pages.
Once all of those things are done, any cache issues that your users are experiencing should slowly go away.
Finally, I must admit that I am not any sort of HTTP guru. I merely followed what I read in these specifications:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.2
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4
If you know of a better way to fix this problem, feel free to let me know...
The problem
After I brought Anthrocon's room share and ride share forums online, I noticed that last year's posts were still present. This was a problem because people needing rides or rooms for this year's conventions did not notice the date and were replying to those posts, thus wasting everyone's time.
Now, I dislike removing content from any website I manage, since that can potentially hurt Google's PageRank on the site. If only there were some way of removing the old posts from those forums without actually deleting the posts...
Then I remembered that Drupal's database is in third-normal form and came up with this query after about 15 minutes of fiddling:
-- temp_nids is a one-column (nid) table created beforehand
INSERT INTO temp_nids (
SELECT nid FROM term_node WHERE nid IN (
SELECT n.nid FROM node AS n
LEFT JOIN term_node AS tn ON n.nid=tn.nid
LEFT JOIN term_data AS td ON tn.tid=td.tid
WHERE td.name IN ('Room Share', 'Ride Share')
AND FROM_UNIXTIME(n.created) < '2007-08-01'
)
);
DELETE FROM term_node WHERE nid IN (SELECT nid FROM temp_nids);
The innermost query (SELECT n.nid FROM node...) selects node IDs that belong to the Room Share or Ride Share forums with a creation date before August, 2007. The query around that (SELECT nid FROM term_node...) I originally had in to make sure that we got valid node IDs from term_node. Given how the query evolved, that's probably no longer necessary. The outermost query (INSERT INTO temp_nids) stored the matching node IDs in our temporary table for later use.
The final query (DELETE FROM term_node...) deletes the offending node IDs from the term_node table, which is responsible for linking nodes to taxonomy terms.
In other Drupal news, I stumbled across a nice little article the other day called 10 Reasons to Use Drupal CMS. While I knew some of the things mentioned in that article, I had no idea that entities such as The United Nations, Forbes, The Discovery Channel, AOL, and, most surprisingly of all, The Grateful Dead all use Drupal. Fascinating stuff.
Also, I found the website Drupal Dojo which contains lots of tutorials on how to perform different tasks in Drupal. It looked just like another how-to type site (not that there's anything wrong with that!) until I came to the article on patch rolling and saw this:
Um, yeah... I sure wasn't expecting to see that particular graphic. :-P