Git is a very powerful revision control system used in software development, and at this point, is effectively the industry standard. One of the things that makes Git so powerful is that there are all sorts of low-level operations that developers can do in it. This includes everything from tagging releases, to having an insane number of branches, to painless merging of said branches, to rewriting history.
If you’re new to Git, your brain probably threw an exception when reaching the end of the paragraph as you thought to yourself, “Wait WHAT? Why would you want to REWRITE HISTORY?” Good question! Normally, you don’t want to mess history in a revision control system. But sometimes you may want to or even need to. A few possible reasons for rewriting history include:
Feature Branch “B” is a branch of Feature Branch “A” and you want to merge B’s changes to master, but not A’s.
Squashing all the commits in a feature branch to a single commit, after having previously pushed those commits.
A developer made a commit with something that doesn’t belong in Git, such as PII or credentials.
You want to “clean up” the commit history a little.
Whether any of the above are fully valid reasons or not is something up to you and your dev team.
That said, rebasing is not something you want to experiment with for the first time on your production repo. It’s just a bad idea. So I built a playground/lab where anyone can experiment with git rebasing in an isolated and local environment, which can be stood up in seconds. Here’s how to get started:
git clone firstname.lastname@example.org:dmuth/git-rebase-i-playground.git
Perhaps you’re worried about being doxxed, perhaps you’ve received some specific threats, maybe you just want to increase your security. No matter the reason, this article is for you! Below I will list a collection of good practices to keep you and your accounts safe online. I fully expect to update this post as things change in the future.
I have tried to put things in a logical order, with some later steps depending on earlier steps, and some things that may be considered “controversial” towards the end.
This post was last updated on Jan 2, 2020.
Let’s start with passwords. I shouldn’t have to say this, but I will do so anyway: do not reuse passwords. Reusing passwords means that if a single account provider is breached and your plaintext password is recovered, you now have additional accounts at risk of compromise. This has happened before.
I recommend using a password manager such as LastPass to keep track of your passwords. While having your passwords stored in an app that uploads them somewhere increases your risk slightly, I feel it is outweighed by using a different password for each service. For passwords themselves, you can use random characters or a system such as Diceware to create long passwords that are easier to remember. While the latter is slightly less secure, a password that can be remembered is one less password to store into a password manager.
Anyway, once you have enabled Untrusted Shortcuts, you can click on any of these links to add shortcuts to Siri that will query the website’s API and report back on the status of Regional Rail, Busses, or both:
If you’re like me, you write a fair bit of a code, which means you have to interact with many Git repositories. If you’re also like me, chances are you have them in a directory called development/ or similar. It might even have some nested directories, something like this:
So that’s cool, but let’s say that you get a new machine and you want replicate your development/ directory structure onto it? One way is to check out everything by hand, but that’s laborious and time consuming. A second way is to keep backups–and you should absolutely do this–but aside from challenges of restoring a single directory out of an entire archive, what if that backup doesn’t have the latest commits in it?
I can now offer a third way. I recently wrote a couple of scripts available on GitHub that can be used to extract Git remote from each repo in an entire directory stucture, and save those remotes and the directories they belong in to a file. Given the above example, it might look something like this:
That may seem like a strange thing to utter, but hear me out. When working in software engineering, you will find yourself doing the same thing over and over. It can be tedious and mind-numbing, and if it’s the sort of thing that involves multiple steps, can increase the risk of human error. For example, one case of human error cost a company millions of dollars and ultimately tanked that company.
This means that the more things that are automated or at least semi-automated, the better. There will be less manual steps to run, and less things that can go wrong because a step was missed or not executed properly. Conversely, because automation means the same thing is done over and over, you’ll get repeatable builds which make things like troubleshooting, multi-tenancy, and disaster recovery easier to perform.
In my case, over my Christmas vacation, I checked into a Mom and Pop hotel, or rather a motel! It was about 24 rooms all in a row, occupying a single floor. Since they were on a budget, their Internet offering consisted of what appeared to be 5 or 6 Linksys routers set up every few rooms. You’d simply connect to the closest access point and have Internet.
But there was a problem: determining which access point was closest to me! The signal strength indicator on my computer showed several of them were 3/3 bars so that wasn’t much help. I tried connecting to the first one, but had virtually no Internet connectivity.
Running that command will print up a confirmation screen so that you can back out and change any options (such as hosts to ping), and when you’re ready, just hit <ENTER> to start the container.
In the above example, I added in the TARGETS environment variable, and was sure to include 192.168.1.1, which was the IP for each router (they were all the same). Then I set Splunk “real-time mode” and periodically checked that tab as I was working. This is what I saw:
In a previous post, I wrote about using Splunk to monitor network health and connectivity. While building that project, I thought it would be nice if I could build a more generic application which could be used to perform ad hoc data analysis on pre-existing data without having to go through a complicated process each time I wanted to do some analytics.
So I built Splunk Lab! It is a Dockerized version of Splunk which, when started, will automatically ingest entire directories of logs. Furthermore, if started with the proper configuration, any dashboards or field extractions which are created will persist after the container is terminated, which means they can be used again in the future.
A typical use case for me has been to run this on my webserver to go through my logs on a particularly busy day and see what hosts or pages are generating the most traffic. I’ve also used this when a spambot starts hitting my website for invalid URLs.
This will print a confirmation screen where you can back out to modify options. By default, logs are read from logs/, config files and dashboards are stored in app/, and data that Splunk ingests is written to data/.
Once the container is running, you will be able to access it at https://localhost:8000/ with the username “admin” and the password that you specified at startup.
First things first, let’s verify our data was loaded and do some field extractions!
I’ve been using Splunk professionally over the last several years, and I’ve become a big fan of using it for my data processing needs. Splunk is very very good about ingesting just about any kind of event data, optionally extracting fields at search time, and providing tools to graph that data, find trends, and see what is really happening on your platform. This is important when your platform consists of thousands of servers, as it does at my day job!
While Splunk can handle events in timestamp key=value key2=value2 format, it also has support for dozens of standardized formats such as syslog, Apache logs, etc. If your data is in a customized format, no problem! Splunk can extract that data at either index or search time! Finally, there’s the Search Processing Language, which is like SQL but for your event data. With SPL, you can run queries, generate graphs, and combine them all programatically.
So yeah, I’m a huge fan of Splunk. One thing I use it for out of the of office is to graph the health of my Internet connection. This is useful both for when I’m at home and when I am traveling–I just feed the output of ping into Splunk and then I can get graphs of packet loss and network latency.
Let’s just jump into an example screen–here’s what I saw when I was a friend’s place and I uploaded a video to YouTube:
At my day job, I get to write a bit of code. I’m fortunate that my employer is pretty cool about letting us open source what we write, so I’m happy to announce that two of my projects have been open sourced!
The first project is an app which I wrote in PHP, it can be used to compare an arbitrary number of .ini files on a logical basis. What this means is that if you have ini files with similar contents, but the stanzas and key/value pairs are all mixed up, this utility will read in all of the .ini files that you specify, put the stanzas and their keys and values into well defined data structures, perform comparisons, and let you know what the differences are. (if any) In production, we used this to compare configuration files for Splunk from several different installations that we wanted to consolidate. Given that we had dozens of files, some having hundreds of lines, this utility saved us hours of effort and eliminated the possibility of human error. It can be found at: