Creating fast incremental backups on a NAS with rsync.

Introduction

This is one of those projects that in retrospect make me realize I should probably do a complete redesign. On the other hand, the current setup has served me perfectly for a couple of years, and watching how a program from 1996 can still deliver makes me happy. So I will just document it anyway.

I was looking for a way to store an incremental backup of my home directory on my Qnap NAS. The Qnap NAS itself offered a couple of standard apps for synchronisation, but these are mainly targeted at Windows systems, are closed source, use proprietary protocols and lack the most recent versions. I needed something that I would still understand months after the initial configuration. So I decided to try something basic and see where rsync would take me.

Basic setup

The basic setup is a two-step process.

  1. Implement a mechanism to synchronize my home directory to the NAS.
  2. Have the NAS locally create daily incremental back-ups of that directory.

What I will end up with on the NAS are daily directories with the complete contents of my home directory on that day. Because hard links are used, storage requirements are minimal: file contents are only stored again when a file changes.

NAS preparation

The Rsync service can be enabled in the “Hybrid Backup Sync” App on the Qnap. Check “Allow other remote Rsync servers to synchronize data to this NAS” and select a username and password. Now I create a new share for my home folder on the mirrored disk. This is done in the “File station” app. After creation, create a new directory “current” on this share.

Now I have a new empty directory /share/home_tj that I can work with. I will write the script on my laptop so it will synchronize to this directory.

Laptop configuration

In its most basic form, the rsync script will look like this:

 File: scripts/backup.sh
rsync \
   --archive \
   --del \
   /home/tj/ \
   rsync@nas.local::home_tj/current

This script needs to be started from the command line, because it will ask for a password. But this is a good start to make sure the basic setup is working and the NAS is receiving the files.

Switches and parameters:

--archive A quick way of saying I want recursion and want to preserve almost everything.
--del I want to delete files from the NAS copy that are no longer present on my laptop.
/home/tj/ The directory on my laptop that I want to send to the NAS.
rsync@nas.local::home_tj/current The directory on the NAS that should receive the copy of the files from my laptop (the DNS name of my NAS is “nas.local”).

Once everything was synchronizing as expected, I added some more switches.

Additional switches:

-v Verbose output
--out-format="%o %M %f" Format of the verbose output: %o for the operation (“send”, “recv”, or “del.”), %M for the last-modified time of the file and %f for the filename.
-zz To compress the network traffic
--password-file=/home/tj/scripts/etc/pwd_rsync The file pwd_rsync contains the rsync password. Since this is stored as clear text, I created this file as root and changed the permissions so only root can read the contents. This also means the backup.sh script needs to be started with sudo.
--timeout=5 Give up with a timeout error when no data has been transferred for 5 seconds. This leaves enough room for the initial NAS response time.
--bwlimit=2048 Limit bandwidth to 2048 KiB per second (2 MiB/s).
--exclude-from=/home/tj/scripts/etc/rsync_exclude File with a specification of the files I do not want to synchronize to the NAS. An example would be the Firefox profile directory.
--delete-excluded Treat excluded files as if they do not exist on the laptop, so any copies already on the NAS are deleted.
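
Put together, the basic command and the additional switches could be combined roughly as follows. This is a sketch, not necessarily the exact script: the run_backup function and the RSYNC variable are hypothetical additions that allow a harmless trial run, while every switch and path is taken from the lists above.

```shell
#!/bin/sh
# Sketch: backup.sh with the additional switches combined.
# RSYNC is a hypothetical hook: with RSYNC=echo the command is printed
# instead of executed.
RSYNC=${RSYNC:-rsync}

run_backup() {
   "$RSYNC" \
      --archive \
      --del \
      -v \
      --out-format="%o %M %f" \
      -zz \
      --password-file=/home/tj/scripts/etc/pwd_rsync \
      --timeout=5 \
      --bwlimit=2048 \
      --exclude-from=/home/tj/scripts/etc/rsync_exclude \
      --delete-excluded \
      /home/tj/ \
      rsync@nas.local::home_tj/current
}
```

Calling run_backup as the last line of the script starts the transfer. With RSYNC=echo set beforehand, it only prints the full command, which is a quick way to check the switches.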

The command now runs without interaction, so it can be added to a cron or systemd job to run automatically. I chose to run it every 2 minutes.

Exclude file generation

Instead of manually tuning the exclude file, I created some logic to generate the file every time the backup script runs. The script will search for files named .norsync. If a directory contains an empty file with that name, the entire directory is excluded. If the file is not empty, it contains the wildcards of the files to exclude from that directory. I added the following before the start of rsync:

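# Start with a clean temporary exclude list
rm -f /tmp/rsync_exclude
find "$BASEDIR" -name .norsync | while read -r FILE
do
   # Directory of the marker file, relative to $BASEDIR
   RELATIVEDIR=$(dirname "${FILE#"$BASEDIR"/}")
   if [[ ! -s $FILE ]]
   then
      # Empty marker: exclude everything in this directory
      echo "$RELATIVEDIR/*" >> /tmp/rsync_exclude
   else
      # Non-empty marker: one wildcard per line, kept literal (no shell expansion)
      while read -r LINE
      do
         echo "$RELATIVEDIR/$LINE" >> /tmp/rsync_exclude
      done < "$FILE"
   fi
done

# Only replace the real exclude file when the content has changed
rsync --checksum /tmp/rsync_exclude "$BASEDIR/scripts/etc/rsync_exclude"
rm -f /tmp/rsync_exclude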
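
As an illustration of the .norsync markers, the sketch below builds a throwaway directory tree and runs the same kind of loop over it. The directory names cache and projects/build and the wildcards are made up for the example, and the result is written to a scratch file instead of the real exclude file.

```shell
#!/bin/sh
# Sketch: which exclude patterns do two .norsync markers produce?
# Everything lives in a throwaway directory; the directory names are made up.
BASEDIR=$(mktemp -d)
mkdir -p "$BASEDIR/cache" "$BASEDIR/projects/build"

# Empty marker: exclude the whole directory
: > "$BASEDIR/cache/.norsync"
# Non-empty marker: exclude only these wildcards
printf '%s\n' '*.o' '*.tmp' > "$BASEDIR/projects/build/.norsync"

EXCLUDES="$BASEDIR/rsync_exclude"
find "$BASEDIR" -name .norsync | while read -r FILE
do
   RELATIVEDIR=$(dirname "${FILE#"$BASEDIR"/}")
   if [ ! -s "$FILE" ]
   then
      echo "$RELATIVEDIR/*"
   else
      while read -r LINE
      do
         echo "$RELATIVEDIR/$LINE"
      done < "$FILE"
   fi
done | sort > "$EXCLUDES"

cat "$EXCLUDES"
```

Running it prints the three generated patterns: cache/*, projects/build/*.o and projects/build/*.tmp.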

Create increments on the NAS

Now I have a directory /share/home_tj/current on the NAS, with a duplicate of my home directory. I schedule a daily script on the NAS to create incremental backups there (script saved in the /share/home_tj directory):

 File: daily.sh
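# Work from the directory that contains this script (the share)
SHARE=$(dirname "$0")
cd "$SHARE" || exit 1

# Daily snapshot: day-01 to day-31, reused a month later
DAY=$(date "+%d")
rsync -a --del --link-dest=../current/ current/ "day-$DAY"
touch "day-$DAY"

# Monthly snapshot on the 1st, e.g. month-02-Feb
if [[ $DAY == "01" ]]
then
   MONTH=$(date "+%m-%b")
   rsync -a --del --link-dest=../current/ current/ "month-$MONTH"
   touch "month-$MONTH"
fi

# Yearly snapshot on January 1st, e.g. year-2023
if [[ $(date "+%d%m") == "0101" ]]
then
   YEAR=$(date "+%Y")
   rsync -a --del --link-dest=../current/ current/ "year-$YEAR"
   touch "year-$YEAR"
fi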

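This keeps a snapshot for every day of the past month (day-* directories), for the first day of each of the last 12 months (month-* directories) and for the first day of every year (year-* directories). The day-* and month-* names repeat, so each of those directories is updated in place when its date comes around again.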

rsync switches and parameters:

-a A quick way of saying I want recursion and want to preserve almost everything.
--del Delete files that are not in the source directory
--link-dest=../current/ Instead of copying unchanged files, create hard links to the files in the current directory.
current/ Source directory
day-`date "+%d"` Destination directory, containing the current day of the month. If today is the 9th, the directory day-09 is used or created. Similar logic is used for the monthly and yearly backups.
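
To see the effect of --link-dest without touching the NAS, the following sketch takes a single snapshot in a throwaway directory and compares inode numbers. It assumes rsync is installed locally, and the file name notes.txt is made up.

```shell
#!/bin/sh
# Sketch: take one snapshot with --link-dest in a throwaway directory
# and check that the snapshot shares its data with current/.
# Assumes rsync is installed locally; the file name is made up.
command -v rsync > /dev/null || { echo "rsync not installed, skipping"; exit 0; }

SHARE=$(mktemp -d)
cd "$SHARE" || exit 1
mkdir current
echo "hello" > current/notes.txt

# Same switches as in daily.sh, with a fixed destination name
rsync -a --del --link-dest=../current/ current/ day-01

# The same inode number means both names refer to one copy on disk
INODE_CURRENT=$(ls -i current/notes.txt | awk '{print $1}')
INODE_DAY=$(ls -i day-01/notes.txt | awk '{print $1}')
echo "current: $INODE_CURRENT  day-01: $INODE_DAY"
```

Both inode numbers should be identical, which means the snapshot did not store a second copy of the file.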

An example to show what happens here: on the 1st of the month, I create a script scripts/backup.sh on my laptop. It is synchronized from the laptop to the NAS as /share/home_tj/current/scripts/backup.sh. After the daily script on the NAS has run on the 1st and the 2nd, I have 3 files on the NAS:

/share/home_tj/current/scripts/backup.sh
/share/home_tj/day-01/scripts/backup.sh
/share/home_tj/day-02/scripts/backup.sh

The trick is that these filenames all point to the same location on disk. So even though the file is in 3 different directories, it is only physically stored once.

Now I change the script on the 3rd of the month. The file is rsync-ed from the laptop to the NAS, which replaces the link in the current directory with the new file. The older daily directories keep pointing to the previous version. The next time the daily job runs on the NAS, it creates a new daily directory (day-04 if the job had already run on the 3rd) with a link to the new script in the current directory.
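
The behaviour described above can be reproduced with plain ln and mv, without rsync. The sketch below uses a throwaway directory and imitates the way rsync updates a file by default: it writes a new file and renames it over the old name.

```shell
#!/bin/sh
# Sketch: why older snapshots keep the old version of a changed file.
# Uses plain ln/mv in a throwaway directory; no rsync involved.
DEMO=$(mktemp -d)
mkdir "$DEMO/current" "$DEMO/day-01" "$DEMO/day-02"

# One file, three names: current/ plus two hard-linked snapshots
echo "version 1" > "$DEMO/current/backup.sh"
ln "$DEMO/current/backup.sh" "$DEMO/day-01/backup.sh"
ln "$DEMO/current/backup.sh" "$DEMO/day-02/backup.sh"

# Replace the file in current/ by writing a new file and renaming it
# over the old name (rsync's default way of updating a file)
echo "version 2" > "$DEMO/current/backup.sh.new"
mv "$DEMO/current/backup.sh.new" "$DEMO/current/backup.sh"

cat "$DEMO/current/backup.sh"   # version 2
cat "$DEMO/day-01/backup.sh"    # version 1
cat "$DEMO/day-02/backup.sh"    # version 1
```

Only the name in current/ points to the new content. The two snapshots still share the old copy, so the history stays intact.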

Qnap crontab

The crontab on the Qnap is a bit special. Normally, on a system with cron installed, the schedule can be edited with crontab -e. This will not fly on the Qnap, however. Instead, to edit the schedule:

$> vi /etc/config/crontab

And to implement the edited schedule:

$> crontab /etc/config/crontab
$> /etc/init.d/crond.sh restart

Display file versions

To quickly check which versions of a certain file are on the NAS, I created a small script in the /share/home_tj/ directory:

 File: fileversions.sh
printf "file: "
read -r FILE

# List every copy, oldest first, and print one line per distinct timestamp
ls -tr */"$FILE" | while read -r VERSION
do
   TIMESTAMP=$(date -r "$VERSION")
   if [[ $TIMESTAMP != "$T" ]]
   then
      ls -l "$VERSION"
      T="$TIMESTAMP"
   fi
done

Execution of this script looks like this:

[/share/home_tj] # ./fileversions.sh
file: scripts/backup.sh
-rwx------    1 1000     1000           607 Jan  2  2020 year-2020/scripts/backup.sh
-rwxr--r--    1 1000     1000           600 Jul  2  2020 year-2021/scripts/backup.sh
-rwxr--r--    1 1000     1000           747 Jan 23  2021 year-2022/scripts/backup.sh
-rwxr--r--    1 1000     1000           704 Jun 30  2022 month-07-Jul/scripts/backup.sh
-rwxr--r--    1 1000     1000           959 Jul 15  2022 year-2023/scripts/backup.sh
-rwxr--r--    1 1000     1000          1052 Jan 31 18:55 month-02-Feb/scripts/backup.sh
-rwxr--r--    1 1000     1000          1737 Feb 12 15:55 day-12/scripts/backup.sh
-rwxr--r--    1 1000     1000          1663 Feb 13 21:27 current/scripts/backup.sh

System tray icon

As a final touch, I wanted a system tray icon that shows the status of the backup and lets me pause the process manually. For this, I installed Scriptinator. Scriptinator shows a configurable icon in the system tray. It can execute scheduled and manually initiated shell scripts and captures their output. From the Font Awesome project I downloaded a couple of icons and saved them to /home/tj/nixos/scriptinator/.

I added status file checks at the start of the backup.sh script, so I can detect whether the backup is paused and show the correct icon:

 File: backup.sh
# Only back up when the home network (tjwifi) shows up in nmcli
if ! nmcli | grep -q tjwifi
then
   echo "Not at home, exiting"
   touch /home/tj/scripts/var/notathome
   exit 0
fi
rm -f /home/tj/scripts/var/notathome

# Skip this run while the backup is paused manually
if [[ -f /home/tj/scripts/var/pausebackup ]]
then
   echo "Backup paused"
   exit 0
fi

The following script is configured as “Periodic script” in Scriptinator, so it will run at regular intervals (set to 10 seconds). This will keep the icon up-to-date with the current status:

 File: backupmon.sh
if [[ -f /home/tj/scripts/var/pausebackup || -f /home/tj/scripts/var/notathome ]]
then
   if [[ -f /home/tj/scripts/var/pausebackup ]]
   then
      IMAGE=pause
      TEXT="Paused manually"
   else
      IMAGE=offline
      TEXT="Paused - not on Braboki"
   fi
else
   TIMER=`systemctl is-active backup-home-tj.timer`
   if [[ $TIMER == active ]]
   then
      IMAGE=`systemctl is-active backup-home-tj.service`
      case $IMAGE in
      failed)
         TEXT=`systemctl status backup-home-tj|grep backup-home-tj-start|cut -f4- -d:`;;
      activating)
         SINCE=`systemctl list-timers | grep backup-home-tj.service | tr -s ' ' | cut -f5 -d' '`
         TEXT="Running since $SINCE";;
      inactive)
         NEXT=`systemctl list-timers | grep backup-home-tj.service | cut -f3 -d' '`
         TEXT="Idle, next: $NEXT";;
      esac
   else
      IMAGE=stop
      TEXT="Stopped - timer service is inactive"
   fi
fi

echo "{PlasmoidIconStart}/home/tj/nixos/scriptinator/$IMAGE.svg{PlasmoidIconEnd}"
echo "{PlasmoidTooltipStart}$TEXT{PlasmoidTooltipEnd}"

The following script is configured as “OnClick” script in Scriptinator, so I can manually pause the process. It creates or deletes the pausebackup status file that is checked in the backup script:

 File: backuppause.sh
if [[ -f /home/tj/scripts/var/pausebackup ]]
then
   rm /home/tj/scripts/var/pausebackup
   STATUS=`systemctl is-active backup-home-tj.service`
else
   touch /home/tj/scripts/var/pausebackup

   STATUS=`systemctl is-active backup-home-tj.service`

   if [[ $STATUS == inactive || $STATUS == failed ]]
   then
      STATUS=pause
   else
      rm /home/tj/scripts/var/pausebackup
   fi

fi

echo "{PlasmoidIconStart}/home/tj/nixos/scriptinator/$STATUS.svg{PlasmoidIconEnd}"
echo "{PlasmoidTooltipStart}$STATUS{PlasmoidTooltipEnd}"

Conclusion

Using rsync, I synchronize my home directory to my NAS. On the NAS, using rsync again, I create a daily incremental backup of the synchronized directory. The process has proved itself to be reliable and stable since its inception. When a backup is needed, I can just browse the NAS, which contains complete versioned daily directories. I use the identical procedure to create backups of my Raspberry Pi Domoticz system.