painless website backup/synchronization

May 18th, 2007

Why you should care

There are quite a few reasons why you would want to back up your website. For one thing, in the case of a security breach, you don't want to lose the files on the server. Even if someone broke in, with a backup you could just restore the site and be back in a jiff. Or maybe you just want full control of your files, and knowing that they sit on some remote server doesn't make you feel as good as knowing they are right on your local disk. Whatever the reason, the following method is well suited to WordPress sites, but general enough to apply to just about any website.

What's more, the method transfers files in both directions, so it's equally suited to deployment. It makes no difference whether you're uploading or downloading; we cover both bases.

How it works

Okay, that was the sales pitch. The script was written to allow for fast deployment of files on a server. Using WordPress as an example, if you're hacking on your theme and you want to upload the one file you changed and see the result, you can do that quickly and painlessly with rsync. rsync synchronizes two locations, transferring only what has changed, so when none of the other files have changed, only that one file is sent.

The files are transferred with rsync over ssh, so you need shell access on the server for this.
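As a rough illustration of the single-file case, here is a minimal sketch that assembles an rsync-over-ssh command for uploading one changed file. The account details (alice, example.com), the helper name upload_cmd and the theme path are all made up for the example. The helper only prints the command, so nothing is transferred until you run the printed line yourself.

```shell
# Hypothetical account details, for illustration only.
user="alice"
host="example.com"
port="22"

# Print (not run) the rsync command that uploads one file over ssh.
# $1 is the local file, $2 the destination directory on the server.
upload_cmd() {
	printf 'rsync --archive --verbose -e "ssh -p %s" %s %s@%s:%s\n' \
		"$port" "$1" "$user" "$host" "$2"
}

upload_cmd style.css public_html/wp-content/themes/mytheme/
```

A remote path that doesn't start with a slash, like the one here, is taken relative to your homedir on the server.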

In a typical example where you have an account on a web server, your file structure at the root level (your homedir) looks something like this:

$ ls ~
.bashrc
.htaccess
.ssh
bin/
etc/
mail/
public_ftp/
public_html/
=> cgi-bin/
=> images/
=> => picture.jpg
=> index.html

tmp/

Only some of these items (public_html, for instance) are ones you want to synchronize with your local disk and keep up to date. There will generally be a lot of other files you're not interested in, generated in your homedir automatically, like raw web traffic logs, mail spam etc. (If an item is a directory, you want all the files and dirs it contains to be synchronized.)

So the issue is to selectively pick the items you want. But there may also be certain files or dirs inside them that you don't want; I ignore cgi-bin, for instance. So you also need a way to exclude certain files/dirs from being transferred.
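One way to express this kind of selection is with rsync filter rules, which rsync checks in order, acting on the first one that matches. The sketch below prints such a rule list, one rule per line. The function name filter_rules and its two arguments (a list of exclusion patterns and a list of wanted top-level items) are assumptions made for the illustration.

```shell
# Print one rsync filter rule per line: the exclusions, then the wanted
# top-level items, then a catch-all that drops every other top-level item.
# $1 is a space-separated list of patterns, $2 a list of top-level names.
filter_rules() {
	set -f    # keep patterns like *.swp from being expanded by the shell
	for pattern in $1; do printf '%s\n' "- $pattern"; done
	for item in $2; do printf '%s\n' "+ /$item"; done
	set +f
	printf '%s\n' "- /*"
}

filter_rules "cgi-bin *.swp" "public_html"
```

Each printed line could then be handed to rsync as a separate --filter='RULE' option. A leading slash anchors a rule to the top of the transfer, so "- /*" only drops top-level items that were not explicitly included, while "- cgi-bin" matches at any depth.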

How to

Now that you know what's happening, it's time to set it up. Fill in the variables at the top of the script: hostname, username and ssh_port describe how to reach the server; local_path is where you want the files on disk; remote_path is where they are located on the server (in most cases ~ or /home/username); locations is the list of top-level directories/files you want to synchronize; and exclusions is the list of patterns you want to exclude (so if it contains cgi-bin, that directory and all the files in it will be excluded from the synchronization).

Once that's done, you just run

$ sync.sh down

to download the files on the server to your local dir, and

$ sync.sh up

to transfer your local changes to the website. Finally,

$ sync.sh

alone will log you into your server with ssh.

Time to synchronize full local/remote tree for matusiak.eu (5470 files) when no changes were made: 4.4 seconds. ;)

A small note about security

Note that this script does not violate or subvert how you access your server. It uses ssh as the underlying security context. You can easily synchronize up/down with public key authentication, in which case you'll never have to type in your password when running sync.sh, and it's actually more secure as well. :)
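If you haven't set up key-based login before, the usual steps with OpenSSH look roughly like the sketch below. The account details are placeholders and the function name key_setup_steps is made up here. It only prints the two commands, so you can review them before running them yourself.

```shell
# Placeholder account details; substitute your own.
user="alice"
host="example.com"
port="22"

# Print the two commands typically used to enable public key login:
# generate a key pair locally, then install the public key on the server.
key_setup_steps() {
	printf 'ssh-keygen -t ed25519\n'
	printf 'ssh-copy-id -p %s %s@%s\n' "$port" "$user" "$host"
}

key_setup_steps
```

ssh-keygen asks where to store the key and for an optional passphrase; ssh-copy-id asks for your account password once in order to install the key.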

#!/bin/bash
#
# Author: Martin Matusiak <numerodix@gmail.com>
# Licensed under the GNU General Public License, version 2.


# server setup
hostname="matusiak.eu"
username=""
ssh_port="22"

# local setup
local_path="/local/path"

# remote setup
remote_path="~"
locations="bin backups public_html"

exclusions="cgi-bin *.swp *~" # *.swp are vim swap files, *~ are editor backup files


## EDIT BELOW THIS LINE IF YOU KNOW WHAT YOU'RE DOING

# rsync options
rsync_options="--archive --verbose --stats --progress"

# switch priority
nice="nice -n 10"


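inc_list=""
# Build the rsync filter rules: exclusions first, then the wanted top-level
# locations, then a catch-all that drops every other top-level item.
function inclusion_list() {
	inc_list=""
	set -f	# disable globbing so patterns like *.swp are not expanded locally
	for i in $exclusions; do
		inc_list="${inc_list}--filter='- $i' "
	done
	for i in $locations; do
		inc_list="${inc_list}--filter='+ /$i' "
	done
	set +f
	inc_list="${inc_list}--filter='- /*'"
}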

function shell() {
	ssh -C -p "${ssh_port}" "${username}@${hostname}"
}

function sync_up() {
	inclusion_list
	cmd="${nice} rsync ${rsync_options} -e \"ssh -p ${ssh_port}\" \
	${inc_list} \
	${local_path}/* \
	${username}@${hostname}:${remote_path} "
	echo "$cmd"
	sh -c "$cmd"
}

function sync_down() {
	inclusion_list
	mkdir -p "${local_path}"
	cmd="${nice} rsync ${rsync_options} -e \"ssh -p ${ssh_port}\" \
	${inc_list} \
	${username}@${hostname}:${remote_path}/* \
	${local_path} "
	echo "$cmd"
	sh -c "$cmd"
}


if [ -z "$1" ]; then
	shell
elif [ "$1" = "down" ]; then
	sync_down
elif [ "$1" = "up" ]; then
	sync_up
else
	echo "Usage: $0 [down|up]" >&2
	exit 1
fi
