On the Virtue of Laziness in Software Engineering

Many years ago, I recall reading in an O’Relly book which stated that when it comes to programming, “laziness” is considered a virtue.

Be lazy, like this cheetah!

That may seem like a strange thing to utter, but hear me out. When working in software engineering, you will find yourself doing the same thing over and over. It can be tedious and mind-numbing, and if it’s the sort of thing that involves multiple steps, can increase the risk of human error. For example, one case of human error cost a company millions of dollars and ultimately tanked that company.

This means that the more things that are automated or at least semi-automated, the better. There will be less manual steps to run, and less things that can go wrong because a step was missed or not executed properly. Conversely, because automation means the same thing is done over and over, you’ll get repeatable builds which make things like troubleshooting, multi-tenancy, and disaster recovery easier to perform.

Continue reading “On the Virtue of Laziness in Software Engineering”

Using Splunk on Hotel Internet

Splunk> Finding your faults, just like Mom.

In a previous post, I wrote about using Splunk to monitor network health. While useful for home and office use, there’s another valuable use for this app, and that’s when traveling.

In my case, over my Christmas vacation, I checked into a Mom and Pop hotel, or rather a motel! It was about 24 rooms all in a row, occupying a single floor. Since they were on a budget, their Internet offering consisted of what appeared to be 5 or 6 Linksys routers set up every few rooms. You’d simply connect to the closest access point and have Internet.

But there was a problem: determining which access point was closest to me! The signal strength indicator on my computer showed several of them were 3/3 bars so that wasn’t much help. I tried connecting to the first one, but had virtually no Internet connectivity.

That’s when I fired up Splunk:

docker run --name splunk-network-health-check \
   -p 8000:8000 \
   -v $(pwd)/splunk-data:/opt/splunk/var/lib/splunk/defaultdb \
   -e "TARGETS= google.com" \

In the above example, I added in the TARGETS environment variable, and was sure to include, which was the IP for each router (they were all the same). Then I set Splunk “real-time mode” and periodically checked that tab as I was working. This is what I saw:

Testing 3 separate hotel Access Points with Splunk
Continue reading “Using Splunk on Hotel Internet”

Introducing: Splunk Lab!

Splunk> Australian for grep.

In a previous post, I wrote about using Splunk to monitor network health and connectivity. While building that project, I thought it would be nice if I could build a more generic application which could be used to perform ad hoc data analysis on pre-existing data without having to go through a complicated process each time I wanted to do some analytics.

So I built Splunk Lab! It is a Dockerized version of Splunk which, when started, will automatically ingest entire directories of logs. Furthermore, if started with the proper configuration, any dashboards or field extractions which are created will persist after the container is terminated, which means they can be used again in the future.

A typical use case for me has been to run this on my webserver to go through my logs on a particularly busy day and see what hosts or pages are generating the most traffic. I’ve also used this when a spambot starts hitting my website for invalid URLs.

So let’s just jump right in with an example:

docker run -p 8000:8000 \
   -v $(pwd)/data:/data \
   -v /var/log/nginx/:/logs \
   -v $(pwd)/app:/app \
   -e SPLUNK_PASSWORD=password \

This will download the container, start it up, and mount the appropriate directories. The containerized version of Splunk looks recursively for logs in /logs/, stores its data in /data/, and stores dashboards that are created in /app/. (Note that if you try to use “password” as your password, the container will refuse to start for safety reasons!)

First things first, let’s verify our data was loaded and do some field extractions!

Continue reading “Introducing: Splunk Lab!”

Using Splunk to Monitor Network Health

Splunk> Winning the War on Error

I’ve been using Splunk professionally over the last several years, and I’ve become a big fan of using it for my data processing needs. Splunk is very very good about ingesting just about any kind of event data, optionally extracting fields at search time, and providing tools to graph that data, find trends, and see what is really happening on your platform. This is important when your platform consists of thousands of servers, as it does at my day job!

While Splunk can handle events in timestamp key=value key2=value2 format, it also has support for dozens of standardized formats such as syslog, Apache logs, etc. If your data is in a customized format, no problem! Splunk can extract that data at either index or search time! Finally, there’s the Search Processing Language, which is like SQL but for your event data. With SPL, you can run queries, generate graphs, and combine them all programatically.

So yeah, I’m a huge fan of Splunk. One thing I use it for out of the of office is to graph the health of my Internet connection. This is useful both for when I’m at home and when I am traveling–I just feed the output of ping into Splunk and then I can get graphs of packet loss and network latency.

Let’s just jump into an example screen–here’s what I saw when I was a friend’s place and I uploaded a video to YouTube:

Continue reading “Using Splunk to Monitor Network Health”

ssh-to: Easily manage dozens or hundreds of machines with SSH

Hey software engineers! Do you manage servers? Lots of servers? Hate copying and pasting hostnames and IP addresses? Need a way to execute a command on each of a group of servers that you manage?

I developed an app which can help with those things, and my employer has graciously given me permission to open source it.

First, here’s the link:


And here’s how to download a copy:

git clone https://github.com/Comcast/ssh-to.git
Continue reading “ssh-to: Easily manage dozens or hundreds of machines with SSH”

Two New Open Source Projects

At my day job, I get to write a bit of code. I’m fortunate that my employer is pretty cool about letting us open source what we write, so I’m happy to announce that two of my projects have been open sourced!

The first project is an app which I wrote in PHP, it can be used to compare an arbitrary number of .ini files on a logical basis. What this means is that if you have ini files with similar contents, but the stanzas and key/value pairs are all mixed up, this utility will read in all of the .ini files that you specify, put the stanzas and their keys and values into well defined data structures, perform comparisons, and let you know what the differences are. (if any) In production, we used this to compare configuration files for Splunk from several different installations that we wanted to consolidate. Given that we had dozens of files, some having hundreds of lines, this utility saved us hours of effort and eliminated the possibility of human error. It can be found at:


Continue reading “Two New Open Source Projects”

So I Wrote A Craps Simulator

Work is sending me to a conference that just happens to be hosted in Las Vegas, a city where there are a few casinos. I’m not much for gambling, so I figured I should learn a little about it before I even think of doing such a thing. I read that craps is a fun game that has some pretty safe bets, so I decided to learn more about that. To that end, I wrote a craps simulator.

Continue reading “So I Wrote A Craps Simulator”

Git 101: How to Handle Merge Conflicts

In the last post, I talked about how to create a Git repository and upload it to GitHub. In this post, I’m going to talk about how to resolve Git conflicts.

Setting Up Our Environment

First, we’re going to create a Git Repository for the user Doug. Since I already covered that in the last post, I’m bring to breeze through those steps below:

$ mkdir doug
$ cd doug
$ git init
Initialized empty Git repository in /path/to/doug/.git/
$ touch README.md
$ git add README.md 
$ git commit -m "First commit"
[master (root-commit) d86a7e2] First commit
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 README.md

At this point, we have a repository created for the user Doug. Now I’m going to clone that repository for the user Andrew:

Continue reading “Git 101: How to Handle Merge Conflicts”