Automated off-site backups with borgmatic
Maintaining an off-site backup copy of a system is almost always a good idea. The scope of the backup could be just critical user data that simply must not be lost, or the entire root file system, for those times you lose at command-line Russian roulette. About a year ago, I started doing full system backups using a program named BorgBackup (short: Borg), a process which was much simpler than I had anticipated. Not much later I set up scheduled automated backups with another tool, borgmatic, which builds on top of Borg. In this post I’ll talk a bit about Borg and go through how to perform backup tasks through both Borg and borgmatic.
So what makes Borg different from other backup software out there? Possibly nothing! I couldn’t tell you because it’s the first piece of backup software I tried setting up myself. I can however tell you of some of the features of Borg that caught my interest, these being:
- Deduplication
- Encryption
- Agentless remotes
Deduplication
Borg performs deduplication (ensuring identical data is not stored twice) automatically. It does this by dividing backed-up files into chunks in a deterministic way and taking note of the hash of every chunk. So when the file kitten.jpeg is backed up for the first time, Borg splits the file into N chunks, calculates a hash for every chunk, and stores those chunks and hashes. The next time that file or an identical file is backed up, Borg will notice (by looking at its collection of hashes) that the chunks already exist in the repository (Borg’s term for the backup location) and will not back up those chunks again. This has multiple advantages:
- If kitten.jpeg exists in multiple different locations on disk when performing a full system backup, the occupied space in the backup repository will not be larger than if there had only been one kitten.jpeg.
- If data inside kitten.jpeg changes between backup invocations, only a subset of the chunks that make up kitten.jpeg in Borg’s eyes will change. Only those chunks will be copied to the backup repository.
- If almost no files have changed between two backup invocations, almost no file data will need to be transferred to the backup repository. This translates into less network traffic and a speedier backup.
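The chunk-and-hash idea can be sketched with standard tools. This is a simplified illustration, not how Borg is implemented: it cuts files into fixed-size 4096-byte chunks, whereas Borg uses content-defined chunking so that chunk boundaries survive insertions in the middle of a file. The store directory and the store_file function are made-up names, and the sketch assumes a shell with GNU coreutils (split, sha256sum, seq).

```shell
#!/usr/bin/env bash
# Toy chunk store: every chunk is saved once, under the name of its SHA-256 hash.
set -eu

workdir=$(mktemp -d)
store="$workdir/store"
mkdir -p "$store"

# store_file FILE: split FILE into 4096-byte chunks, copy the chunks that are
# not in the store yet, and print how many chunks had to be copied.
store_file() {
    local file=$1 tmp chunk hash new=0
    tmp=$(mktemp -d)
    split -b 4096 "$file" "$tmp/chunk."
    for chunk in "$tmp"/chunk.*; do
        hash=$(sha256sum "$chunk" | cut -d' ' -f1)
        if [ ! -e "$store/$hash" ]; then
            cp "$chunk" "$store/$hash"
            new=$((new + 1))
        fi
    done
    rm -rf "$tmp"
    echo "$new"
}

# Stand-in for kitten.jpeg: a bit under 14 kB of text, which makes four chunks.
seq 1 3000 > "$workdir/kitten.jpeg"
cp "$workdir/kitten.jpeg" "$workdir/kitten_copy.jpeg"

first=$(store_file "$workdir/kitten.jpeg")        # every chunk is new
second=$(store_file "$workdir/kitten_copy.jpeg")  # identical file, nothing new
echo "more pixels" >> "$workdir/kitten_copy.jpeg"
third=$(store_file "$workdir/kitten_copy.jpeg")   # only the last chunk changed

echo "new chunks stored: $first, then $second, then $third"
```

The second file costs nothing to store because all of its chunks are already present, and the modified copy only costs the one chunk whose contents changed.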
Encryption
Borg allows for client-side encryption of all backed-up data. To accomplish this, a passphrase-protected encryption key is generated on the client when a backup repository is created. This key is then stored in a locked state on the repository side. Whenever any type of backup operation is to be performed, the client requests the key from the repository, unlocks it with the passphrase (which should only ever reside on the client) and proceeds from there. Any chunks being sent from the client to the repository will have already been encrypted on the client side.
Agentless remotes
When performing backups against an off-site remote, it is not strictly necessary for that remote to have Borg installed. The developers promise performance gains if it is, but SSH access to the remote server is enough for Borg to carry out its work, for example by mounting the remote directory locally through a network file system such as sshfs and treating it as a local repository.
Quick start
To start off, we need to install Borg. Borg is available in the default repositories of many Linux distributions. If you are on macOS, you can install it with brew. Consult the official documentation for more detailed instructions on how to install on your OS.
Once you have Borg installed, we’ll initialize a repository and back up a single file. We’ll create the repository in a local directory for now.
# Here you will be prompted for a passphrase to protect your key
borg init --encryption=repokey ~/my_first_repo
echo "Some contents" > file_to_be_backuped.txt
# This backs up file_to_be_backuped.txt to a new archive "my_first_archive" in
# the previously created repository
borg create ~/my_first_repo::my_first_archive file_to_be_backuped.txt
Easy, huh? You can now inspect the contents of the archive with borg list ~/my_first_repo::my_first_archive. If you want to extract the files from the backup, run borg extract ~/my_first_repo::my_first_archive, which restores them into the current working directory.
Automated backups by means of script
You now know how Borg can be used to create repositories and back up files to archives within them. Performing full off-site backups isn’t a whole lot different. There’s a full example provided in the official documentation. It’s basically the same thing as what we just did, with a couple of changes and additions:
- It initializes a remote repository instead of a local one.
- It has borg create target (most of) the root file system instead of just a single file.
- It adds a borg prune command. This ensures we have no more than a manageable number of backups at any given time by deleting all archives not matching some age criteria.
You can actually just use this script, modify it for your own use case, trigger it with cron or a systemd timer at some regular interval and you’ll have your backups all set up! I did this for a while until I started using borgmatic.
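To give a feel for the shape of such a script, here is a stripped-down sketch. It is not the official example, and it assumes Borg 1.x command syntax. The repository URL, the DRY_RUN switch and the function names are placeholders I made up for illustration. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of a create-then-prune backup script (not the official example).
set -eu

# Hypothetical repository; replace with your own. Borg reads BORG_REPO, so the
# commands below can refer to the repository with a bare "::".
export BORG_REPO=${BORG_REPO:-ssh://backup@backup.example.com/srv/borg/my_repo}

# In this sketch DRY_RUN defaults to 1, which only prints the commands.
# Set DRY_RUN=0 to really run them (Borg then needs the passphrase, either
# typed in or supplied through BORG_PASSPHRASE).
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Back up (most of) the system into an archive named after host and timestamp.
create_backup() {
    run borg create --stats --compression lz4 \
        '::{hostname}-{now}' /etc /home /root /var
}

# Delete archives that fall outside the retention rules.
prune_archives() {
    run borg prune --list --keep-daily 7 --keep-weekly 4 --keep-monthly 6
}

create_backup
prune_archives
```

Saved as an executable file, a script like this could be triggered by a daily cron entry or a systemd timer with DRY_RUN=0 in its environment.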
borgmatic
borgmatic differentiates itself from Borg by largely being configuration-driven. This means that, instead of writing a script to perform your backups, you write a configuration file and run borgmatic. Let’s start off by installing borgmatic. As with Borg, this is well-documented for many different OSes and I would suggest that you follow the instructions in the documentation.
Once we have borgmatic installed, we can go ahead and start configuring by creating a /etc/borgmatic/config.yaml file and ensuring it has appropriate permissions:
mkdir -p /etc/borgmatic
touch /etc/borgmatic/config.yaml
chmod 600 /etc/borgmatic/config.yaml
We do not want the configuration to be world-readable as it will contain our passphrase, hence mode 600 (readable and writable by the owner only). Next, we can fill it with some contents:
location:
    source_directories:
        - /etc
        - /home
        - /root
        - /var
    repositories:
        # Use your own repository here
        - k5dptkf7@k5dptkf7.repo.borgbase.com:repo
storage:
    # Use your own securely generated passphrase here
    encryption_passphrase: swimwear-atonable-juncture-frequent
retention:
    keep_daily: 7
    keep_weekly: 4
    keep_monthly: 6
This is a very minimal configuration but it gets the job done. Upon running borgmatic, the most critical parts of the system (/etc, /home, /root and /var) will be backed up to the remote repository k5dptkf7@k5dptkf7.repo.borgbase.com:repo. The retention settings also make sure old backup archives are pruned so that we only keep daily backups for the last 7 days, weekly backups for the last 4 weeks and monthly backups for the last 6 months.
Running borgmatic regularly
So the configuration file we just talked about takes care of configuring borgmatic, but what about running it? Ideally we would want the borgmatic command to run once every day without any manual intervention from the user. Luckily, borgmatic ships with both a systemd service and timer unit. You can take a look at the service unit at /usr/lib/systemd/system/borgmatic.service. It can look a bit daunting but it mostly consists of options that sandbox the borgmatic process in such a way that it’s only permitted to do what it needs to do: perform backups. The timer unit /usr/lib/systemd/system/borgmatic.timer is considerably easier on the eyes:
[Unit]
Description=Run borgmatic backup
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
We can see that it is configured to run once daily and to be persistent, meaning that if a run was missed, for example because the system was powered off at the time, the timer will trigger immediately once the system is started again. Let’s go ahead and enable and start it!
systemctl enable --now borgmatic.timer
And there you have it! You’ve now set up automated daily off-site backups with borgmatic. Congratulations. Hopefully you’ll sleep a tiny bit better knowing all of those kitten.jpeg files of yours are stored away safely in an off-site backup.
Wait, how do I set up a remote? Which remote should I use?
If there was something I glossed over in this post, it was this. There are multiple different options here. If you have your own remote server somewhere with the storage needed available, you can use that as your backup remote. Simply make sure the user running the borgmatic command (root in the previous example) has SSH access to that machine.
I have for some time been using BorgBase as my remote. Their small plan includes 100 GB of storage at a rate of $2 per month, which covers my use case well enough. I have no affiliation with BorgBase. If you wish to sponsor the borgmatic project, you can use this referral link if you decide to sign up.
In both cases, whether you have your own remote backup server or use BorgBase, you should use SSH keys to facilitate the connection, both because an SSH key is safer than a password and because borgmatic will be running non-interactively, so there won’t be anyone around to input the password.
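As a rough sketch of the key setup, the following generates a dedicated key pair for the user that runs borgmatic. The file name borgmatic_ed25519, the KEY_DIR override and the choice of an Ed25519 key are my own assumptions for illustration. The key is created without a passphrase so that it can be used unattended, which means the private key file itself has to be kept safe.

```shell
#!/usr/bin/env bash
# Create a dedicated SSH key for unattended backups (run as the backup user).
set -eu

# Hypothetical file name; any path works as long as ssh is pointed at it.
key_dir=${KEY_DIR:-$HOME/.ssh}
key="$key_dir/borgmatic_ed25519"

mkdir -p "$key_dir"
chmod 700 "$key_dir"

# An empty passphrase (-N "") lets the key be used without a prompt.
# An existing key is kept rather than overwritten.
if [ ! -e "$key" ]; then
    ssh-keygen -q -t ed25519 -N "" -C "borgmatic backup" -f "$key"
fi

# Register this public key with the remote, for example in
# ~/.ssh/authorized_keys on your own server or in the BorgBase web interface.
cat "$key.pub"
```

For ssh to pick up a key with a non-default file name, point to it with an IdentityFile entry for the remote host in the same user’s ~/.ssh/config.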