If you are storing files in Amazon S3, you absolutely positively should enable AWS S3 Access Logging. This will cause every single access in a bucket to be written to a logfile in another S3 bucket, and is super useful for tracking down bucket usage, especially if you have any publicly hosted content in your buckets.
But there’s a problem–AWS goes absolutely bonkers when it comes to writing logs in S3. Multiple files will be written per minute,each with as few as one event in them. It comes out looking like this:
Serverless is an app which lets you deploy applications on AWS and other cloud providers without actually spinning up virtual servers. In our case, we’ll use Serverless to create a Lambda function which executes periodically and performs rollup of logfiles.
So once you have the code, here’s how to deploy it:
cp serverless.yml.exmaple serverless.xml
vim serverless.xml # Vim is Best Editor
serverless deploy # Deploy the app. This will take some time.
If you’re like me, you write a fair bit of a code, which means you have to interact with many Git repositories. If you’re also like me, chances are you have them in a directory called development/ or similar. It might even have some nested directories, something like this:
So that’s cool, but let’s say that you get a new machine and you want replicate your development/ directory structure onto it? One way is to check out everything by hand, but that’s laborious and time consuming. A second way is to keep backups–and you should absolutely do this–but aside from challenges of restoring a single directory out of an entire archive, what if that backup doesn’t have the latest commits in it?
I can now offer a third way. I recently wrote a couple of scripts available on GitHub that can be used to extract Git remote from each repo in an entire directory stucture, and save those remotes and the directories they belong in to a file. Given the above example, it might look something like this: