Welcome to my website and blog! It has been been around in one form or another for well over a decade. It’s changed purposes a few times, and at this point is now a mostly tech blog where I share things I’ve learned or made. Feel free to have a look around, or consider checking out some of the more popular posts I’ve made over the years:
If you’re been reading this blog for awhile, you’ll know that I’m a big fan of Splunk, and I even went so far as to Dockerize it for use in a lab/testing environment.
Well today I want to talk about a command in Splunk which I believe is seriously underrated: makeresults.
Makeresults (documented here) lets you generate fake events for testing purposes. No indexes are queried, no disks are touched, which means that makes results is very very fast. And when a query runs quickly, that means you can run it more times which means new queries and content will be developed faster.
In this post, I’m going to walk you through a way to use makeresults to learn the difference between the streamstats and eventstats commands.
Google Drive is one of my favorite apps for storing and editing documents and spreadsheets. If don’t currently use Google Drive in place of Microsoft Office, I would recommend checking it out!
That said, while it’s a useful tool, your files are being stored on somebody else’s computer, which means that if your Google account should get hacked or suspended, you will lose access to your files. Not good.
In this post, I will show you how to back up the contents of your Google Drive onto your filesystem. You will need a medium level of knowledge and some experience with the command line for this.
Installing and Configuring Rclone
First, start by downloading Rclone. Rclone is a command line app for managing, copying, and syncing files across over 40 different cloud providers. In addition to Google Drive, it has support for Dropbox, AWS S3, Microsoft OneDrive, and a whole list of cloud providers that I’ve never even heard of!
Once you have Rclone downloaded, start up its configuration wizard by typing:
That’s where Prometheus, Loki, and Grafana all come in. Prometheus is a time series database built for storing metrics. Loki is a log collection system which scales horizontally and is useful for collecting application logs, and Grafana is the dashboard app which is used to view metrics from either platform!
I wanted to learn more about each of these apps, and I figured the best way to do so was to build out something in Docker that let me ingest data immediately, and then to build some sample dashboards on top of that. I then open sourced it, and the entire project can be found at https://github.com/dmuth/grafana-playground
First, clone the repo and start up all of the Docker containers:
git clone https://github.com/dmuth/grafana-playground.git
docker-compose up -d
This will start up several containers, some of which will ingest data, some of which will store data.
I’ve seen complaints pop up on Twitter that people are getting their accounts suspended over years old tweets that happen to contain copyrighted music. So let’s say that, like me, you have a Twitter account over 10 years old and you want to go through your old tweets so you can pull any such video before Twitter does — how do you go about doing that?
Well here’s the thing: the UNIX command line is incredibly powerful if you know how to use it. In this post, I’ll show you how to use the bash shell in Linux or Mac OS/X to find those videos so that you can remove them.
The first thing you gotta do is download your entire Twitter archive. There are instructions on how to do that here. Once you put in the request, you’ll hear back from Twitter within a day when the download is ready. Expect the file to be rather large — in my case it was over 2 Gigs. Download that file and unzip it.
At the time of this writing, all of your media will be found in the folder data/tweet_media/, so cd into that directory and see how many files there are:
In Part 1, I wrote about how to get your data out of Evernote and into Obsidian. In this post, I’m going to cover how to get the most out of Obsidian in terms of functionality.
Organizing in Obsidian
At a high level, I like to use The PARA Method, which consists of 4 high-level folders for storing your notes. Those folders are:
Projects: Projects are things you are actively researching or working on, such as this blog post. They have deliverables and they have deadlines. Notes should not exist in your project folder forever, but instead be moved into another folder.
Areas of Responsibility: The literal definition that I’ve seen elsewhere is “activity with a standard to be maintained over time”. If you’re using Obsidian for work, it might be for platforms which you own and perform occasional maintenance on, runbooks for dealing with specific issues, etc. If you’re using Obsidian for personal use, it might be for taking notes from books you read, notes about your health, your car, finances, etc.
Resources: A resource is defined as “a topic or theme of ongoing interest”. This might be things like recipes, ideas for home improvement, and the like. I will be the first to admit that sometimes the line blurs between Resources and Areas. The best advice I can offer is to try not to sweat the details here.
Archives: Stuff you’re not using anymore, such as projects you’ve finished or travel plans for trips taken. I recommend ZIPing the folders that hang out in this directory. (Obsidian won’t mind)
In this blog post, I’m going to talk about why I moved 3,000 Evernotes from Evernote to Obsidian, and walk through the process of doing so with the help of a third-party open source tool, plus some code I wrote to drive that tool.
Breaking Up With Evernote
I used to love Evernote. I started using it back in 2012 or so, and before long I got the paid version! It met a lot of needs I had at the time, such as being able to take notes using a lightweight interface, letting me search through all of my notes, and letting me attach files and images to them. And I could even do all of those from my iPhone!
But it’s 2021, and things have changed. I don’t believe Evernote has grown as much as it could have as an app. Between the company trying unsuccessfully to launch over products, and the 2018 layoffs, Evernote is not in the best shape as a company.
But the real nail in the coffin for Evernote in my eyes was the “new” release of Evernote a few months ago. Gone was the feature that would let me export an entire note as HTML. Gone was the feature that would let me export all attachments from a note. These were features I had come to rely on when I needed to get to my data, and Evernote took them out of the product! Furthermore, there was one particularly nasty bug that had me fearing for the state of my notes:
That’s right–after the app being open for a few days, clicking on a note–any note–would show a blank screen where my content would be! The first time I saw this, I went into a near panic. I tried restarting Evernote and that fixed the problem for a few days until it came back. I filed a support request with Evernote about this, yet a few versions later this disturbing bug remains.
So I started looking around, and I found something I fell in love with: Obsidian.
Obsidian can best be described as “An IDE for Markdown documents“. Don’t know what Markdown is? No problem, it’s very simple syntax used to mark up documents (WAY simpler than HTML) that you can learn in no time at all!
And instead of using a database to store notes, Obsidian uses the filesystem to store them. What this means is that backing up your data is as a simple as zipping up the root folder, which Obsidian calls a “Vault”. And your vault can be sitting in a folder shared to Dropbox, OneDrive, any similar service. This separation of concerns makes Obsidian less complicated because it doesn’t have to sync its own notes, that’s a win!
Tarsplit is a utility I wrote which can split UNIX tarfiles into multiple parts while keeping files in the tarballs intact. I specifically wrote those because other ways I found of splitting up tarballs didn’t keep the individual files intact, which would not play nice with Docker.
But what does Docker have to do with tar?
While building the Docker images for my Splunk Lab project, I noticed that one of the layers was something like a Gigabyte in size! While Docker can handle large layers, the issue become one of time it takes to push or pull an image. Docker does validation on each layer as it’s received, and the validation of a 1 GB layer took 20-30 wall clock seconds. Not fun.
It occurred to me that if I could split up that layer into say, 10 layers of 100 Megabytes each, then Docker would transfer about 3 layers in parallel, and while a layer is being validated post-transfer, the next layer would start being transferred. The end result is less wall clock seconds to transfer an entire Docker image.
But there was an issue–Splunk and its applications are big. A few of the tarballs are hundreds of Megabytes in size, and I needed a way to split those tarballs into smaller tarballs without corrupting the files in them. This led to Tarsplit being written.
According to the docs, Eventgen is a Splunk App that lets users built real-time event “generators” so that one-off event generators don’t need to be built.
What does this mean? Let’s say you run a Splunk platform, and you want to create some new dashboards for a data source in production, but want to do this on dev. Without Eventgen, you would need to write a script to generate fake events and write them to a file which is read in by Splunk. That is a lot of work.
And we all have better things to do than write one-off code.
Why Use Eventgen?
With Eventgen, you can create a sample file with say, 1,000 events from that data source, and configure Eventgen to write a random event from that file straight to Splunk via its API, with current timestamps. The end result is that you’ll have a steady stream of realistic events flowing into Splunk with current timestamps, without the need to read from (and rotate) logfiles in the filesystem.
How Eventgen Is Used in Splunk Lab
I took approximately 1,400 lines of logs from my blog’s webserver and included them into Splunk Lab. When Eventgen is used, a random event from that file will be written into Splunk at the rate of approximately once per second. Because the events that make their way into Splunk are random, there will be a short-term fluctuation in the frequency of specific URLs, HTTP verbs, HTTP statuses, etc. This is perfect for creating dashboards that mimic what you might see in a production environment.
So, you want to start a furry convention! Great! We’ll never say no to a new event. And this is a topic I happen to know a little about, as I’ve been staffing/running conventions for 20 years now. (I’m old)
Let’s start with a few questions I’d like to throw out, the sort of questions every new convention organizer should ask themselves…
A Furry Convention Building Checklist:
Have you staffed a con before?
Are you incorporated? If so, as a 501(c)3 or 501(c)7?
Do you have a separate bank account for the organization?
Do you have a budget?
Do you have a hotel or other venue?
Do you have a signed contract with the venue?
Did you read the contract in its entirety?
Do you have liability insurance?
Do you have legal counsel?
Have you lined up people who can be senior staff/department heads for major portions of the convention, such as Programming, A/V, Dealers Room, etc.?
Have you vetted your staff?
Do you have a website?
How about social media?
Is there another con already serving the same general area?
If so, will potential attendees feel that they are forced to choose between the two cons?
Git is a very powerful revision control system used in software development, and at this point, is effectively the industry standard. One of the things that makes Git so powerful is that there are all sorts of low-level operations that developers can do in it. This includes everything from tagging releases, to having an insane number of branches, to painless merging of said branches, to rewriting history.
If you’re new to Git, your brain probably threw an exception when reaching the end of the paragraph as you thought to yourself, “Wait WHAT? Why would you want to REWRITE HISTORY?” Good question! Normally, you don’t want to mess history in a revision control system. But sometimes you may want to or even need to. A few possible reasons for rewriting history include:
Feature Branch “B” is a branch of Feature Branch “A” and you want to merge B’s changes to master, but not A’s.
Squashing all the commits in a feature branch to a single commit, after having previously pushed those commits.
A developer made a commit with something that doesn’t belong in Git, such as PII or credentials.
You want to “clean up” the commit history a little.
Whether any of the above are fully valid reasons or not is something up to you and your dev team.
That said, rebasing is not something you want to experiment with for the first time on your production repo. It’s just a bad idea. So I built a playground/lab where anyone can experiment with git rebasing in an isolated and local environment, which can be stood up in seconds. Here’s how to get started:
git clone email@example.com:dmuth/git-rebase-i-playground.git