Managing e-statements for fun and profit

Unless you’ve been living under a rock for the last 5 years, you should have noticed the imminent extinction of paper statements from most of the consumer services. Whether we like it or not, e-bills are the way of the future.

One problem with e-bills is that they still need to be stored – preferably, in a way that makes them easy to find. One also cannot rely on the service owners to archive old bills, as they rarely keep them for longer than 1 year.

A typical process managing such bills involves multiple steps:

  • Download the new bill from the website of the service provider.
  • Rename the PDF file from an exciting automatic (e.g 8B96766CBC0A49E6A45C2B039623C56C.pdf) to a boring one (e.g. TRowePrice-2017-Q3.pdf)
  • Store it so that one could find it when needed (ideally, not in a single folder holding thousands of such files).

About 3 years ago, I’ve written a relatively simple tool that attempts to automate the latter two of those 3 steps (automatically downloading is notoriously difficult, given different website organization for each provider and the fact that they love to tweak the website layout, frequently breaking the scraping procedure).

The tool is called classify_bills and is available on GitHub. When run on a bunch on PDF files, it tries to figure out what account each file belongs to and what time period it is for. It then renames it appropriately and moves it into an appropriate destination.

The tool is configuration-driven, it basically runs off a bunch of regular expressions (hat tip to jwz), and configuring a new account is strictly speaking, not pleasant. Still, once configured, it eliminates the chore of laboriously doing those actions manually, bill by bill, month by month, making the process faster and pleasanter.

Using Raspberry Pi to monitor and water plants

Introduction

I’ve been growing hot peppers on my balcony and one of the things I constantly fail to keep on schedule is watering them. As it is with many chores, this is something simple yet not exciting enough to avoid being actively procrastinated on.

As such, I only saw several solutions to that problem:

  • Become better at avoiding procrastination (yeah, ri-i-i-ight!)
  • Outsource the activity to some hapless family member.
  • Get rid of the peppers.
  • Throw technology (and time, and money) at the problem.

Given all those options, the choice was rather simple. As a result, I built a computer-driven watering station that takes care of my peppers for me. At this point, my involvement is only needed to periodically add water to the water tank (about once every 2-3 weeks). Apart from that, the system is fully self-driven.

In its current incarnation, the system handles 4 pots (number of pots is limited by current hardware – in theory it should be possible to add more relays and moisture sensors). It periodically measures moisture level of the soil in each of the pots. When it decides the soil is “dry enough” it waters the plant. It also makes sure watering has the desired effect (i.e. preventing conditions like tank running empty or watering tube being misplaced). Finally, it posts all kinds of stats on a dashboard. (not yet, actually).

Continue reading “Using Raspberry Pi to monitor and water plants”

Rotating between different fonts in Emacs

There are two types of people: those who set on one font/color theme for the rest of their life and those like me, who constantly tweak their setup. Not that it helps productivity, but hey – RMS created .emacs for a reason!

Since typing in font names over and over again is a pain, I’ve created a neat function that can be called interactively to rotate between various programming-friendly fonts installed in the system. The list of fonts is partly mine, partly borrowed from Programming Fonts  (which is awesome on its own). This allows a very quick way to see how one’s code looks with different fonts (and make a final decision for the next 3 weeks or so).

Continue reading “Rotating between different fonts in Emacs”

Creating reproducible ZIP archives

An Artist A DayI am currently working on an app that loads data in relatively small incremental payloads from the server side for offline use. Payload files are in ZIP format and the way the app knows whether it has the correct file (which might not be true – the file might be missing, partially downloaded or simply changed by developer) is by checking it’s checksum against the local file. If there is a checksum mismatch, the file is (re)downloaded.

During development, I regenerate the payload files many times and when the time comes, I release the “golden” versions into production. Obviously, I insist on reproducible builds and this includes the bundle files – if the bits of the data that went into those files haven’t changed, there should be absolutely no reason why the generated bundle should differ in a single bit. After all if the file has a new signature, the app will download it again (even if the data is the same). I could devise a more elaborate way via some kind of versioning but it would be more complex, more error-prone, and wouldn’t add much benefit to the user. Plus, it’s nice to know that your data is guaranteed to be the same.

Image credit: New Line Cinema.
Image credit: New Line Cinema.

Turns out, things are not so easy when it comes to ZIP files. Both OS X and Ubuntu offer command-line zip program which by default sticks gobs of metadata into the ZIP archive, including timestamps. Thus, no two zips are alike (well, maybe they will be if run within the same second, but I didn’t bother to check).

Unfortunately there is no way to tell zip to only include most critical metadata (file names and sizes) and ignore everything else. It does offer -X (aka -no-extra) option to strip some attributes but this doesn’t include the file timestamps.

As a result, the only way that seems to be generating consistently reproducible ZIP archives from the same set of files is to force the modification time on files to some well-known time and zip them without any extra attributes. More specifically:

touch -t 201212210314.16 . myfiles.*
zip -X -q myarchive.zip myfiles.*

This works consistently (and portably) on OS X and in Linux but of course, there is no guarantee it will stay the same: new zip version might slightly change the algorithm (hopefully not, considering how pervasive ZIP is). But  at least for now, this seems to be the only way to perform reproducible, portable ZIP archive generation.