Looking back after this morning's stampede, I thought I'd share with folks how the webserver held up, since I know I am not the only geek out there. And, truth be told, I was a bit nervous myself, since I wasn't quite sure just how much traffic we would get and if the webserver would survive, or turn into a smoking crater.
Well, here's what we got:
The first hump is a manual backup I did last night. The second is the automatic backup that runs every morning, where the database and files are rsynced to a machine at another data center. The third hump at 9 AM was when we opened hotel reservations. 1.4 Megabits/sec doesn't look too bad, until you look at:
The peak of 336 simultaneous connections was far more interesting. That's about 16 times the normal number of connections to the webserver.
So, what were the effects? Let's look at MySQL first:
There were nearly 1000 queries per second, about 500 of which were served from the cache. That leaves roughly half of the queries not hitting the MySQL cache, so there's definite room for improvement. But before I look at the RAM situation, let's look at the CPU usage:
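For the number-minded, here is a small sketch of the cache hit ratio implied by those figures. The values are the rounded numbers from the graph, and the function name is just something I made up for illustration.

```python
# Rounded figures read off the MySQL graph; both are approximations.
QUERIES_PER_SEC = 1000   # "nearly 1000" total queries per second
CACHED_PER_SEC = 500     # roughly how many were answered from the cache


def cache_hit_ratio(cached: float, total: float) -> float:
    """Return the fraction of queries answered from the cache."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cached / total


if __name__ == "__main__":
    ratio = cache_hit_ratio(CACHED_PER_SEC, QUERIES_PER_SEC)
    misses = QUERIES_PER_SEC - CACHED_PER_SEC
    print(f"hit ratio: {ratio:.0%}, uncached queries/sec: {misses}")
```

In other words, about one query in two still has to be executed by MySQL, which is the headroom I am talking about.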
One of the cores was close to 100%, but there was virtually no I/O wait. The load average was also good to see--it was actually less than the load during the nightly backups. From a performance standpoint, both of these graphs look very good, as they mean the disk was not the bottleneck. But why not? Well, here's the final piece of the puzzle:
The RAM usage is what ties all of the other graphs together. By keeping the memory usage near constant, I was able to avoid hitting swap space, which would have incurred a huge performance penalty and quite possibly a "death spiral".
How did I keep RAM usage so low? Instead of running the Apache webserver, which requires a separate process for each connection, I ran the Nginx webserver. Unlike Apache, it uses asynchronous I/O to handle incoming requests. This approach scales much better than Apache, which creates a separate child process for each "listener" and chews up a lot of memory.
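To make the asynchronous idea concrete, here is a minimal sketch using Python's asyncio. It is only an illustration of the general technique, not how Nginx itself works internally (Nginx is written in C with its own event loop). The handler, the canned response, and the client count of 50 are all assumptions made up for the demo.

```python
import asyncio


async def handle_client(reader, writer):
    """Serve one connection. Each await hands control back to the event
    loop, so a single process can juggle many connections at once."""
    await reader.readuntil(b"\r\n\r\n")  # consume the whole request header
    writer.write(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def fetch(port):
    """Hypothetical test client: send one request, read until EOF."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"GET / HTTP/1.0\r\n\r\n")
    await writer.drain()
    response = await reader.read()
    writer.close()
    await writer.wait_closed()
    return response


async def demo(n_clients=50):
    """Start the server on a free port and hit it with concurrent clients."""
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await asyncio.gather(*(fetch(port) for _ in range(n_clients)))


if __name__ == "__main__":
    responses = asyncio.run(demo())
    print(f"served {len(responses)} connections from a single process")
```

The point of the sketch is that no process or thread is created per connection; the per-connection cost is a small coroutine object rather than a whole child process, which is why memory stays flat as connections climb.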
For comparison, the number of simultaneous connections peaked at "only" around 100 during last year's convention. We broke the old record by more than a factor of 3.
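For anyone who wants to check the arithmetic behind those comparisons, here is a quick sketch. The inputs are the rounded figures quoted in this post (336 at peak, about 16 times normal, around 100 last year), so the results are approximate, and the function names are just illustrative.

```python
# Rounded figures quoted in the post; treat the results as approximate.
PEAK_CONNECTIONS = 336   # this year's peak
NORMAL_MULTIPLE = 16     # peak was "about 16 times" the normal level
LAST_YEAR_PEAK = 100     # "around 100" during last year's convention


def implied_baseline(peak: float, multiple: float) -> float:
    """Normal connection count implied by the peak and its multiple."""
    return peak / multiple


def growth_factor(peak: float, previous_peak: float) -> float:
    """How many times larger this peak is than the previous record."""
    return peak / previous_peak


if __name__ == "__main__":
    baseline = implied_baseline(PEAK_CONNECTIONS, NORMAL_MULTIPLE)
    growth = growth_factor(PEAK_CONNECTIONS, LAST_YEAR_PEAK)
    print(f"implied normal level: about {baseline:.0f} connections")
    print(f"growth over last year's record: about {growth:.1f}x")
```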
"And what have we learned?"
Even under the highest load to date, we were in no danger of running out of RAM. This means that I can (and probably should) allocate more memory to MySQL so that more queries are cached and overall performance is increased even more. There are also some more advanced caching modules that I intend to research, to see if we can cache straight off of the filesystem and avoid the database altogether. More on that as it happens.