If you are storing files in Amazon S3, you absolutely positively should enable AWS S3 Access Logging. This will cause every single access in a bucket to be written to a logfile in another S3 bucket, and is super useful for tracking down bucket usage, especially if you have any publicly hosted content in your buckets.
But there’s a problem–AWS goes absolutely bonkers when it comes to writing logs in S3. Multiple files will be written per minute,each with as few as one event in them. It comes out looking like this:
Serverless is an app which lets you deploy applications on AWS and other cloud providers without actually spinning up virtual servers. In our case, we’ll use Serverless to create a Lambda function which executes periodically and performs rollup of logfiles.
So once you have the code, here’s how to deploy it:
cp serverless.yml.exmaple serverless.xml
vim serverless.xml # Vim is Best Editor
serverless deploy # Deploy the app. This will take some time.
I’m a big fan of Amazon S3 for storage, and I like it so much that I use Odrive to sync folders from my hard drive into S3 use S3 to store copies of all of my files from Dropbox as a form of backup. I only have about 20 GB of data that I truly care about, so that should be less than a dollar per month for hosting, right? Well…
Close to 250 GB billed for last month. How did that happen?
I’m a big fan of the Discord Musicbot, and run it on some Discord servers that I admin. Wanting to run it on a server, I first created an Ansible playbook and launched a server on Digital Ocean. But after a few months, I noticed that the server was sitting over 90% idle. Surely there had to be a better way.
So I next tried Docker, and created a Dockerized version of the Musicbot. I was quite happy with how much easier it was to spin up the bot, but still didn’t want to run it on a dedicated server on Digital Ocean. Aside from having unused capacity, if that machine were to go down, I’d have to do something manually.
I thought about running it in some sort of hosted Docker cluster, and came across Amazon’s container service. So this post is about creating your own cluster in ECS and hosting a Docker container in it. I found the process slightly confusing when I did it the first time, and wanted to share my experience here.
While S3 is a great storage platform, what happens if you accidentally delete some important files? Well, S3 has a mechanism to recover deleted files, and I’d like to go into that in this post.
First, make sure you have versioning enabled on your bucket. This can be done via the API, or via the UI in the “properties” tab for your bucket. Versioning saves every change to a file (including deletions) as a separate version of said object, with the most recent version taking precedence. In fact, a deletion is also a version! It is a zero-byte version which has a “DELETE” flag set. And the essence of recovering undeleted files simply involves removing the latest version with the “DELETE” flag.
This is what that would look like in the UI:
To undelete these files, we’ll use a script I created called s3-undelete.sh, which can be found over on GitHub:
Earlier tonight, I had the pleasure of attending a presentation from Chris Munns of Amazon at the offices of First Round Capital about scaling your software on AWS past the first 10 million users. I already had some experience with AWS, but I learned quite a few new things about how to leverage AWS, so I decided to write up my notes in a blog post for future reference, and as a service to other members of the Philadelphia tech community.
Without further preamble, here are my notes from the presentation: