A while ago, I found myself trying to decide which of several restaurants to eat at. They were all highly rated on Yelp, but surely there were more insights I could pull from their reviews. So I decided to Splunk them!
Yelp has an API, but I am sorry to say that it is awful. It will only let you download 3 reviews for any venue. That’s it! What a crime.
So… I had to crawl Yelp venue pages to get reviews. I am not proud of this, but I was left with no other option.
Python has been my go-to language lately, so I decided to solve the problem of review acquisition with Python. I used the Requests module to fetch the HTML code, and the Beautiful Soup module to extract reviews and page links from the HTML.
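To give a rough idea of what the extraction step involves, here is a minimal sketch. It is not the code from this project: it uses Python's built-in html.parser instead of Beautiful Soup so that it runs without extra packages, and the "review" and "next" class names and the sample HTML are made up for illustration. Yelp's real markup is different.

```python
from html.parser import HTMLParser


class ReviewExtractor(HTMLParser):
    """Collect text from <p class="review"> and hrefs from <a class="next">.

    Both class names are hypothetical stand-ins for whatever the real page uses.
    """

    def __init__(self):
        super().__init__()
        self.reviews = []
        self.next_links = []
        self._in_review = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "p" and "review" in classes:
            # Start buffering the text of a new review.
            self._in_review = True
            self._buffer = []
        elif tag == "a" and "next" in classes and attrs.get("href"):
            # Remember pagination links so further pages can be fetched later.
            self.next_links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "p" and self._in_review:
            self.reviews.append("".join(self._buffer).strip())
            self._in_review = False

    def handle_data(self, data):
        if self._in_review:
            self._buffer.append(data)


def extract(html):
    """Return (reviews, next_links) found in an HTML string."""
    parser = ReviewExtractor()
    parser.feed(html)
    parser.close()
    return parser.reviews, parser.next_links


# Made-up sample page, not real Yelp markup.
SAMPLE_HTML = """
<html><body>
<p class="review">Great tacos, slow service.</p>
<p class="review">Would <b>definitely</b> come back.</p>
<a class="next" href="/biz/example?start=20">Next</a>
</body></html>
"""

reviews, links = extract(SAMPLE_HTML)
print(reviews)
print(links)
```

In a real crawl, the HTML string would come from an HTTP fetch, and each link in next_links would be fetched in turn until no more pages are left.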
I recently noticed that something was using up lots of RAM on my Mac, as it would periodically slow down. I had some suspects, but rather than regularly checking in Activity Monitor, I thought it would be more helpful if I had a way to monitor usage of RAM by various processes over time.
Due to previous success with my Splunk Lab app, I decided to use it as the basis for building out a RAM monitoring app. The data acquisition part, however, was trickier. The output of the UNIX ps app isn’t very structured, and I had some problems parsing that data, especially in situations where there were spaces in filenames and arguments to those commands.
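To illustrate the parsing problem, here is a small sketch. The sample line is made up, and it assumes output in the shape of something like ps -eo pid,rss,command (PID, resident memory, then the full command line).

```python
# Made-up sample line: PID, RSS in KB, then the command with its arguments.
SAMPLE = (
    "  412  183024 /Applications/Google Chrome.app/Contents/MacOS/"
    "Google Chrome --type=renderer"
)


def naive_parse(line):
    """Split on every run of whitespace, which shreds paths containing spaces."""
    return line.split()


def bounded_parse(line):
    """Split off only the first two columns and keep the rest as one string."""
    pid, rss, command = line.split(None, 2)
    return {"pid": int(pid), "rss_kb": int(rss), "command": command}


naive = naive_parse(SAMPLE)
bounded = bounded_parse(SAMPLE)

print(len(naive))          # more fields than the three columns we asked for
print(bounded["command"])  # intact, but path and arguments are still one blob
```

Limiting the split keeps the command in one piece, but there is still no reliable way to tell where the executable path ends and its arguments begin, because both can contain spaces.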
So I wrote a replacement for ps. It turns out that Python has a module called psutil, which lets you programmatically examine the process tree on your Mac. I ended up writing an app called Better PS, which writes highly structured data on each running process to disk, where it is then ingested by Splunk.
I’ve written about Docker before, as I am a big fan of it. And for this post, I’m going to talk about some practical situations in which I’ve used Docker in real life, both for testing and software development!
But first, let’s recap what Docker IS and IS NOT:
Docker containers spin up quickly (1-2 seconds or less)
Docker containers DO have separate process trees and filesystems
Docker containers ARE NOT virtual machines
Docker containers ARE intended to be ephemeral (short-lived).
You CAN, however, mount filesystems from the host machine into Docker, so those files can live on after the container shuts down (or is killed).
You SHOULD only run one service per Docker container.
Everybody got that? Good. Now, let’s get into some real life things I’ve used Docker for.
Experimenting in Linux
Want to test out some commands or maybe a shell script that you’re worried might be destructive? No worries, try it in a Docker container, and if you nuke the filesystem, there will be no long-term consequences.
# Start a container with Alpine Linux
$ docker run -it alpine
# Let's do something dumb
/ # rm /bin/ls
/ # ls -l
/bin/sh: ls: not found
# Just exit the container, restart it, and our filesystem is back!
/ # exit
$ docker run -it alpine
/ # ls
bin dev etc home lib media mnt proc root run sbin srv sys tmp usr var
And all of the above takes just a couple of seconds! This works with other Linux distros as well, such as CentOS and Ubuntu. Just change your Docker command accordingly:
docker run -it centos
docker run -it ubuntu
Yes, that means you could run CentOS in a container under Ubuntu or vice-versa. Docker doesn’t care. 🙂
One of my activities outside of the office consists of staffing furry conventions. One of those conventions is Anthrocon, a furry convention held in downtown Pittsburgh every June/July. At that particular convention, I manage the website and their social media properties.
Yesterday, we opened general hotel reservations, and that resulted in a huge rush of members booking hotel rooms. 1,000 rooms were booked in the first 15 minutes! This was completely expected, and we kept track of how things played out on social media, and also took a survey of members who booked hotel rooms to see how things went. In this post, we’re going to share what we learned based on those survey results and Twitter activity.
First, did people who booked a hotel room get the hotel that they wanted?
For nearly 70% of you, the answer is yes. This makes us happy, but we would like to see the number higher—ideally 100% of our attendees would get a room in the hotel of their choice. This is something we continue to work on each year by adding new hotels and getting bigger room blocks in existing hotels.
I’m a huge fan of NodePing, a service that monitors website and service availability. It can ping hosts, monitor HTTP/HTTPS, and check other services such as POP3/IMAP, DNS, and more! It can also perform “advanced HTTP” monitoring, checking the HTTP response code or the content of the response. I use NodePing to monitor pretty much all of my hobbyist projects, as well as those belonging to friends.
One thing that gets tricky, however, is how to do alerting. NodePing lets you do email and text notifications, but neither feels “right” to me, especially if you want to alert multiple people at once. So I came up with a better way: sending webhooks into Slack! In this post, I am going to walk you through the process of making this happen.
First, you’ll need to purchase a plan on NodePing. Plans on NodePing start at $8/month, but I personally recommend the $15/month plan as you can monitor up to 200(!) different services with it. You’ll also need to create your own Slack instance, and Slack has a free tier, which I recommend.
After creating a Slack instance, I recommend downloading and configuring both the desktop and mobile clients to connect to it.
Setting Up A Webhook In Slack and NodePing
Now that you’re signed up with both services, you’ll need to create a webhook in Slack. To do that, go to the “Applications” page on Slack’s website and choose the “Incoming Webhooks” app. Add a new integration and copy the URL of the webhook into your clipboard:
Note that whichever Slack channel you send alerts to is completely up to you. My personal recommendation is to create a separate channel just for alerts from NodePing.
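Before pointing NodePing at the webhook, it can be worth checking that the webhook itself works by posting a test message by hand. Here is a minimal sketch using only Python's standard library. The URL is a placeholder (substitute the one you copied from Slack), and the message text is made up. Slack's incoming webhooks accept a JSON body with a "text" field.

```python
import json
import urllib.request

# Placeholder only: replace with the webhook URL copied from Slack.
WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"


def build_request(webhook_url, text):
    """Build (but do not send) a POST request carrying a Slack message."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(webhook_url, text):
    """Send the message and return the HTTP status code."""
    request = build_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Building the request is safe to run as-is; nothing goes over the network.
request = build_request(WEBHOOK_URL, "Test alert: example.com is down")
print(request.get_method(), request.full_url)

# Uncomment once WEBHOOK_URL points at a real webhook:
# print(send(WEBHOOK_URL, "Test alert: example.com is down"))
```

If the test message shows up in your alerts channel, the webhook is good and any remaining problems are on the NodePing side.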
When I made the move to WordPress a few weeks ago, I had a lot to learn, both about the functionality that WordPress has to offer and about which plugins I could install and which of those actually worked well!
So I’m going to spend this post sharing what plugins I found the most useful so that anyone else who is getting into WordPress can have an easier time getting started.
Even if you don’t use Facebook or Twitter, chances are that your visitors do and they share your content on those sites. So this plugin is probably the most important plugin of the entire list, because it adds the appropriate meta tags to ensure that when your content is shared on either service, it is rendered correctly.
Furthermore, the Open Graph plugin allows you to set a default image and override it with either an image from the post itself or one uploaded separately:
Again, I cannot stress it enough: if you want your content to look presentable on social media sites, you need to use this plugin. Otherwise, you are passing up a huge opportunity.
One of the neat things about WordPress is that when you upload an image and then include that image in a blog post, you can decide where that image links to. The image can link to nothing at all, the raw image, or an “attachment page” which contains that image and a caption.
That said, something that has caused me grief with out-of-the-box WordPress installs is that the image on the media page is really small. Take, for example, this picture of a freeloading cheetah. When I upload the picture, the attachment page looks like this:
Just look at that. A tiny image and a bunch of the page being completely unused. Disgraceful. Surely we can do better!
As it turns out, tweaking a single line of code changes the size of all images on media pages.
In my case, over my Christmas vacation, I checked into a Mom and Pop hotel, or rather a motel! It was about 24 rooms all in a row, occupying a single floor. Since they were on a budget, their Internet offering consisted of what appeared to be 5 or 6 Linksys routers set up every few rooms. You’d simply connect to the closest access point and have Internet.
But there was a problem: determining which access point was closest to me! The signal strength indicator on my computer showed several of them were 3/3 bars so that wasn’t much help. I tried connecting to the first one, but had virtually no Internet connectivity.
Running that command will print a confirmation screen so that you can back out and change any options (such as the hosts to ping). When you’re ready, just hit <ENTER> to start the container.
In the above example, I added in the TARGETS environment variable, and was sure to include 192.168.1.1, which was the IP for each router (they were all the same). Then I set Splunk “real-time mode” and periodically checked that tab as I was working. This is what I saw:
In a previous post, I wrote about using Splunk to monitor network health and connectivity. While building that project, I thought it would be nice if I could build a more generic application which could be used to perform ad hoc data analysis on pre-existing data without having to go through a complicated process each time I wanted to do some analytics.
So I built Splunk Lab! It is a Dockerized version of Splunk which, when started, will automatically ingest entire directories of logs. Furthermore, if started with the proper configuration, any dashboards or field extractions which are created will persist after the container is terminated, which means they can be used again in the future.
A typical use case for me has been to run this on my webserver to go through my logs on a particularly busy day and see what hosts or pages are generating the most traffic. I’ve also used this when a spambot starts hitting my website for invalid URLs.
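To make the "which hosts or pages are generating the most traffic" question concrete, here is a rough plain-Python sketch of the same kind of tally. It is not how Splunk Lab works internally. It assumes access logs in Common Log Format, and the sample lines are made up.

```python
from collections import Counter

# Made-up access log lines in Common Log Format.
SAMPLE_LOG = """\
203.0.113.5 - - [10/Oct/2019:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
203.0.113.5 - - [10/Oct/2019:13:55:40 +0000] "GET /wp-login.php HTTP/1.1" 404 153
198.51.100.7 - - [10/Oct/2019:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326
203.0.113.5 - - [10/Oct/2019:13:56:02 +0000] "GET /xmlrpc.php HTTP/1.1" 404 153
"""


def top_hosts_and_pages(log_text):
    """Count requests per client host and per requested path."""
    hosts = Counter()
    pages = Counter()
    for line in log_text.splitlines():
        # The request line ("GET /path HTTP/1.1") is the first quoted field.
        parts = line.split('"')
        if len(parts) < 3:
            continue  # skip lines that do not look like access log entries
        prefix_fields = parts[0].split()
        request_fields = parts[1].split()
        if not prefix_fields or len(request_fields) < 2:
            continue
        hosts[prefix_fields[0]] += 1
        pages[request_fields[1]] += 1
    return hosts, pages


hosts, pages = top_hosts_and_pages(SAMPLE_LOG)
print(hosts.most_common(3))
print(pages.most_common(3))
```

A script like this answers one question at a time. The appeal of loading the logs into Splunk instead is being able to ask follow-up questions interactively without writing a new script for each one.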
This will print a confirmation screen where you can back out to modify options. By default, logs are read from logs/, config files and dashboards are stored in app/, and data that Splunk ingests is written to data/.
Once the container is running, you will be able to access it at https://localhost:8000/ with the username “admin” and the password that you specified at startup.
First things first, let’s verify our data was loaded and do some field extractions!