Using rsync for Data Synchronisation

You will often need to transfer data into, out of, or within CloudCIX, and we recommend rsync for this purpose. Rsync is a synchronisation tool: it can be used as a simple copy command, but as this tutorial shows, it is far more versatile than copy utilities such as SCP. It is also typically faster on repeated transfers, because it sends only what has changed; it runs over SSH, so data is encrypted in transit; and it is widely supported.

Rsync was developed by Andrew Tridgell and Paul Mackerras as a Ph.D. research project and it was released in 1996.

By default, rsync connects to remote hosts over SSH, which uses TCP port 22. If you can already open an SSH session to a remote host, you can use rsync with that host without creating any new firewall rules.
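As a minimal sketch of rsync over SSH, the snippet below builds a push command and prints it rather than running it. The user, host address, and paths are hypothetical placeholders, not CloudCIX values. The -e option names the remote shell and appears here only to show where a non-default SSH port would go.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own user, host and paths.
REMOTE_USER="admin"
REMOTE_HOST="203.0.113.10"      # address from a documentation-only range
SSH_PORT=22                     # change this if sshd listens on another port

# -a  archive mode, -v  verbose output
# -e  remote shell to use; rsync already defaults to ssh, so this is
#     spelled out only to pass the port explicitly
RSYNC_CMD="rsync -av -e 'ssh -p ${SSH_PORT}' ./site/ ${REMOTE_USER}@${REMOTE_HOST}:/var/www/site/"

# Printed, not executed, because the host above is only an example.
echo "$RSYNC_CMD"
```

If the printed command matches what you intend, it can be pasted into a shell with your real host and paths filled in.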

Click the link below to watch our data synchronisation video for this tutorial:

Data Synchronisation Training Video

Rsync Benefits

  1. Rsync is fast, in part because it can compress data in transit (the -z option).

  2. Rsync is an extraordinarily versatile file copying tool with many options.

  3. Rsync can copy locally, to or from another host over any remote shell, or to or from a remote rsync daemon. In this tutorial, we will not cover the daemon implementation of rsync.

  4. Rsync offers many options that control every aspect of its behaviour and permit very flexible specification of the set of files to be copied.

  5. Rsync is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination.

  6. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use. It is ideal for the automation of such processes via the implementation of cron jobs.

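  7. If rsync is interrupted during a copy or synchronisation, it can be re-run and will skip the files that have already been transferred. With the --partial option, a partially transferred file is also kept so that the transfer can continue from where it stopped.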
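The "only send what changed" behaviour in the list above can be seen with a small local experiment. This is a sketch under stated assumptions: the scratch directory and file names are invented for the demonstration, and because local copies default to whole-file transfers, it shows rsync skipping unchanged files rather than the block-level delta algorithm used over a network.

```shell
#!/bin/sh
# Scratch area with two hypothetical website files.
demo=$(mktemp -d)
mkdir -p "$demo/src"
echo "home page"  > "$demo/src/index.html"
echo "about page" > "$demo/src/about.html"

# First run: everything is new, so both files are copied.
rsync -a "$demo/src/" "$demo/dst/"

# Change one file only (appending also changes its size).
echo "an extra paragraph" >> "$demo/src/about.html"

# Second run: --itemize-changes lists only the items rsync updates.
second_run=$(rsync -a --itemize-changes "$demo/src/" "$demo/dst/")
echo "$second_run"

# Remove the scratch directory when finished: rm -rf "$demo"
```

The second run should mention about.html but not index.html, which is the same behaviour that lets an interrupted job be re-run without recopying finished files.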

Use Cases

Rsync copies from local to local, from local to remote, and from remote to local.
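The three directions differ only in which argument carries a host prefix. In the sketch below the local-to-local copy actually runs in a scratch directory, while the two remote forms are printed only, because the account, host name, and remote path are hypothetical.

```shell
#!/bin/sh
# Scratch area for the local example.
demo=$(mktemp -d)
mkdir -p "$demo/src"
echo "hello from CloudCIX" > "$demo/src/file.txt"

# 1. Local to local. The trailing slash on the source means
#    "the contents of src", not the directory src itself.
rsync -a "$demo/src/" "$demo/dst/"

# Hypothetical remote account and path, used for illustration only.
REMOTE="user@backup.example.com"

# 2. Local to remote (push): the destination carries the host prefix.
echo "rsync -a $demo/src/ $REMOTE:/srv/backup/src/"

# 3. Remote to local (pull): the source carries the host prefix.
echo "rsync -a $REMOTE:/srv/backup/src/ $demo/restore/"
```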

One of the original use cases envisaged was uploading a website to a server. For a website in continuous development, only the changes need to be uploaded. This is why the program is described as a synchronisation program rather than just a copy program.

Note

The most commonly used option is -a (or --archive). This option implies a number of other options, including recursive directory copying, and it ensures that file permissions are preserved.
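A short sketch of what archive mode does in practice follows. The directory layout and script name are invented for the demonstration; the copy is local so that it can be run safely anywhere rsync is installed.

```shell
#!/bin/sh
# Build a nested source tree containing one executable script.
demo=$(mktemp -d)
mkdir -p "$demo/src/scripts/nightly"
echo "echo backup" > "$demo/src/scripts/nightly/run.sh"
chmod 750 "$demo/src/scripts/nightly/run.sh"

# -a is shorthand for -rlptgoD: recurse into directories and preserve
# symlinks, permissions, modification times, group, owner, and
# device/special files (owner is only preserved when run as root).
rsync -a "$demo/src/" "$demo/dst/"

# Show both copies side by side so mode and timestamp can be compared.
ls -l "$demo/src/scripts/nightly/run.sh" "$demo/dst/scripts/nightly/run.sh"
```

The nested directories are recreated under dst, and the copied script should show the same mode and modification time as the original.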

Warning

Always use the --dry-run option on untested commands. This will help prevent data loss by verifying that you are copying the right data from the correct source to the correct destination.
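As an illustration of the warning, the sketch below rehearses a command that would otherwise delete a file. The scratch directory and file names are hypothetical, and --delete is used here only as an example of an option that can cause data loss.

```shell
#!/bin/sh
# A source with one new file and a destination with one extra file.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
echo "new data" > "$demo/src/new.txt"
echo "old data" > "$demo/dst/old.txt"

# --delete removes destination files that are absent from the source,
# which is exactly the kind of command worth rehearsing first.
# --dry-run (short form -n) reports the actions without performing them.
rsync -av --dry-run --delete "$demo/src/" "$demo/dst/"

# Nothing has changed on disk: old.txt remains and new.txt was not copied.
ls "$demo/dst"
```

Once the reported actions look right, the same command without --dry-run performs them.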