TrumanWong

rsync

Remote data synchronization tool

Additional notes

The rsync command is a remote data synchronization tool that can quickly synchronize files between hosts over a LAN or WAN. It uses the so-called "rsync algorithm" to synchronize files between the local and remote hosts: the algorithm transmits only the parts of a file that differ instead of the entire file each time, which makes it fast. rsync is very powerful and has many options, which are explained one by one below.

Syntax

rsync [OPTION]... SRC DEST
rsync [OPTION]... SRC [USER@]HOST:DEST
rsync [OPTION]... [USER@]HOST:SRC DEST
rsync [OPTION]... [USER@]HOST::SRC DEST
rsync [OPTION]... SRC [USER@]HOST::DEST
rsync [OPTION]... rsync://[USER@]HOST[:PORT]/SRC [DEST]

Corresponding to the above six command formats, rsync has six different working modes:

  1. Copy local files. This mode is used when neither the SRC nor the DEST path contains a single colon ":" separator. For example: rsync -a /data /backup
  2. Use a remote shell program (such as rsh or ssh) to copy the contents of the local machine to the remote machine. This mode is used when the DEST path contains a single colon ":" separator. For example: rsync -avz *.c foo:src
  3. Use a remote shell program (such as rsh or ssh) to copy the contents of the remote machine to the local machine. This mode is used when the SRC path contains a single colon ":" separator. For example: rsync -avz foo:src/bar /data
  4. Copy files from a remote rsync server to the local machine. This mode is used when the SRC path contains the "::" separator. For example: rsync -av root@192.168.78.192::www /databack
  5. Copy files from the local machine to a remote rsync server. This mode is used when the DEST path contains the "::" separator. For example: rsync -av /databack root@192.168.78.192::www
  6. List the files on the remote machine. This is like an rsync transfer, except that the local path is omitted from the command. For example: rsync -v rsync://192.168.78.192/www
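
The colon rules in the list above can be sketched as a small shell function. This is only an illustration: rsync_path_kind is a hypothetical helper, not part of rsync, and it looks at a single path argument rather than a whole command line. It also assumes rsync's convention that a slash before the first colon marks a local path.

```shell
# Hypothetical helper (not part of rsync): report which transport the
# colon rules above imply for one SRC or DEST argument.
rsync_path_kind() {
    case "$1" in
        rsync://*) echo daemon; return ;;   # rsync:// URL
    esac
    case "$1" in
        *:*) ;;                             # has a colon: inspect further
        *)   echo local; return ;;          # no colon at all
    esac
    host=${1%%:*}                           # text before the first colon
    rest=${1#*:}                            # text after the first colon
    case "$host" in
        */*) echo local; return ;;          # slash before the colon: local path
    esac
    case "$rest" in
        :*) echo daemon ;;                  # "::" separator
        *)  echo remote-shell ;;            # single ":" separator
    esac
}

rsync_path_kind /data                       # local
rsync_path_kind foo:src                     # remote-shell
rsync_path_kind root@192.168.78.192::www    # daemon
```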

Options

-v, --verbose Verbose mode output.
-q, --quiet Quiet mode; suppress non-error messages.
-c, --checksum Decide whether a file needs to be transferred by comparing checksums instead of modification time and size.
-a, --archive Archive mode: transfer files recursively and keep almost all file attributes; equal to -rlptgoD.
-r, --recursive Process subdirectories recursively.
-R, --relative Use relative path information.
-b, --backup Create a backup: when a file with the same name already exists at the destination, rename the old file to filename~. Use the --suffix option to specify a different backup suffix.
--backup-dir=DIR Store backup files in DIR.
--suffix=SUFFIX Define the backup file suffix.
-u, --update Update only: skip files that already exist in DEST and are newer than the source file, so newer files are not overwritten.
-l, --links Preserve soft links.
-L, --copy-links Treat soft links like regular files.
--copy-unsafe-links Copy the file a link points to, rather than the link itself, only for links pointing outside the SRC directory tree.
--safe-links Ignore links pointing outside the SRC path directory tree.
-H, --hard-links Preserve hard links.
-p, --perms Preserve file permissions.
-o, --owner Keep file owner information.
-g, --group Keep file group information.
-D, --devices Keep device file information (in current versions -D also implies --specials).
-t, --times Keep file time information.
-S, --sparse Handle sparse files efficiently to save space in DEST.
-n, --dry-run Show which files would be transferred without actually transferring them.
-W, --whole-file Copy whole files without the incremental (delta-transfer) check.
-x, --one-file-system Do not cross file system boundaries.
-B, --block-size=SIZE Block size used by the checksum algorithm. Old versions defaulted to 700 bytes; current versions choose the size based on each file's size.
-e, --rsh=COMMAND Specify the remote shell to use for data synchronization, such as rsh or ssh.
--rsync-path=PATH Specify the path of the rsync command on the remote server.
-C, --cvs-exclude Automatically ignore files in the same way CVS does.
--existing Only update files that already exist in DEST; do not create new files there.
--delete Delete files in DEST that are not in SRC.
--delete-excluded Also delete files on the receiving end that match the exclude patterns.
--delete-after Delete after the transfer is completed rather than before it.
--ignore-errors Delete even when I/O errors occur.
--max-delete=NUM Delete at most NUM files.
--partial Keep partially transferred files, which speeds up a later retransfer.
--force Force deletion of the directory, even if it is not empty.
--numeric-ids Do not match numeric user and group IDs to user and group names.
--timeout=SECONDS I/O timeout, in seconds.
-I, --ignore-times Do not skip files with the same time and length.
--size-only When deciding whether to transfer a file, compare only the file size and ignore the modification time.
--modify-window=NUM The timestamp window used to determine whether files have the same time. The default is 0.
-T, --temp-dir=DIR Create temporary files in DIR.
--compare-dest=DIR Also compare files against DIR to determine whether a transfer is needed.
-P Equivalent to --partial --progress.
--progress Display progress during the transfer.
-z, --compress Compress file data during transfer.
--exclude=PATTERN Exclude files matching PATTERN.
--include=PATTERN Do not exclude files matching PATTERN.
--exclude-from=FILE Read exclude patterns from FILE.
--include-from=FILE Read include patterns from FILE.
--version Print version information.
--address=ADDRESS Bind to a specific address.
--config=FILE Use a configuration file other than the default rsyncd.conf.
--port=PORT Use a non-default rsync service port.
--blocking-io Use blocking I/O for remote shells.
--stats Print statistics about the file transfer.
--log-format=FORMAT Specify the log format.
--password-file=FILE Get password from FILE.
--bwlimit=KBPS Limit I/O bandwidth, KBytes per second.
-h, --help Display help information (-h shows help only when it is the only option given).
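
To make the -b and --suffix entries in the list above concrete, here is a minimal sketch that runs entirely in a temporary directory. The directory layout, the file name notes.txt and the .bak suffix are assumptions chosen for the illustration.

```shell
# Sketch: overwrite a file while keeping the old version as a backup.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/destination"
echo "new version" > "$work/source/notes.txt"
echo "old"         > "$work/destination/notes.txt"

# -b keeps the file that is about to be replaced; --suffix names the copy.
rsync -a -b --suffix=.bak "$work/source/" "$work/destination"

cat "$work/destination/notes.txt"       # new version
cat "$work/destination/notes.txt.bak"   # old
```

The two files are given different sizes on purpose, so that rsync's default size-and-time check cannot treat them as identical and skip the transfer.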

Example

SSH mode

First start the ssh service on the server:

service sshd start
Start sshd: [OK]

Synchronization using rsync

Next, use the rsync command on the client to back up the data on the server. SSH mode authenticates with a system user account on the server, as follows:

rsync -vzrtopg --progress -e ssh --delete work@172.16.78.192:/www/* /databack/experiment/rsync
work@172.16.78.192's password:
receiving file list...
5 files to consider
test/
a
0 100% 0.00kB/s 527:35:41 (1, 20.0% of 5)
b
67 100% 65.43kB/s 0:00:00 (2, 40.0% of 5)
c
0 100% 0.00kB/s 527:35:41 (3, 60.0% of 5)
dd
100663296 100% 42.22MB/s 0:00:02 (4, 80.0% of 5)
sent 96 bytes received 98190 bytes 11563.06 bytes/sec
total size is 100663363 speedup is 1024.19

The above information describes the entire backup process and the total size of the backup data.

Daemon mode

Start the rsync service: edit the /etc/xinetd.d/rsync file, change disable=yes to disable=no, and restart the xinetd service, as follows:

vi /etc/xinetd.d/rsync

# default: off
# description: The rsync server is a good addition to an ftp server, as it \
# allows crc checksumming etc.
service rsync
{
    disable = no
    socket_type = stream
    wait = no
    user = root
    server = /usr/bin/rsync
    server_args = --daemon
    log_on_failure += USERID
}

/etc/init.d/xinetd restart
Stop xinetd: [OK]
Start xinetd: [OK]

Create a configuration file. Installing rsync does not automatically create its main configuration file, /etc/rsyncd.conf, so it must be created manually. Create the file and insert the following content:

vi /etc/rsyncd.conf

uid=root
gid=root
max connections=4
log file=/var/log/rsyncd.log
pid file=/var/run/rsyncd.pid
lock file=/var/run/rsyncd.lock
secrets file=/etc/rsyncd.passwd
hosts deny=172.16.78.0/22

[www]
comment= backup web
path=/www
read only=no
exclude=test
auth users=work

Create a password file. In this mode, system users cannot be used to authenticate the client, so you need to create a password file in the format "username:password". The username and password can be chosen arbitrarily, and it is best that they do not match a system account. Also set the permissions of the password file to 600, as described for the module parameters.

echo "work:abc123" > /etc/rsyncd.passwd
chmod 600 /etc/rsyncd.passwd
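
A slightly more defensive variant of the two commands above is sketched below. It sets the umask before writing, so the file is never readable by other users, not even briefly. The temporary directory is an assumption that stands in for /etc so the example can run without root; the work:abc123 pair is taken from the text.

```shell
# Sketch: create the secrets file with restrictive permissions from the start.
# secrets_dir stands in for /etc so this can run as an unprivileged user.
secrets_dir=$(mktemp -d)
secrets_file="$secrets_dir/rsyncd.passwd"

(
    umask 077                                   # files created here get mode 600
    printf '%s:%s\n' work abc123 > "$secrets_file"
)
chmod 600 "$secrets_file"                       # explicit, in case the file already existed

ls -l "$secrets_file"
```

On the client side, the --password-file option listed earlier expects a file that contains only the password, not the "username:password" pair.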

Back up. After completing the above steps, you can back up the data as follows:

rsync -avz --progress --delete work@172.16.78.192::www /databack/experiment/rsync

Password:
receiving file list ...
6 files to consider
./ files...
a
0 100% 0.00kB/s 528:20:41 (1, 50.0% of 6)
b
67 100% 65.43kB/s 0:00:00 (2, 66.7% of 6)
c
0 100% 0.00kB/s 528:20:41 (3, 83.3% of 6)
dd
100663296 100% 37.49MB/s 0:00:02 (4, 100.0% of 6)
sent 172 bytes received 98276 bytes 17899.64 bytes/sec
total size is 150995011 speedup is 1533.75

Recovery. When the server's data is damaged, it can be restored from the client's copy, provided that the server grants the client write permission (read only=no in the module above); otherwise the client cannot restore the server directly. To recover data with rsync:

rsync -avz --progress /databack/experiment/rsync/ work@172.16.78.192::www

Password:
building file list...
6 files to consider
./
a
b
67 100% 0.00kB/s 0:00:00 (2, 66.7% of 6)
c
sent 258 bytes received 76 bytes 95.43 bytes/sec
total size is 150995011 speedup is 452080.87

Synchronize source directory to target directory

$ rsync -r source destination

In the above command, -r means recursive, that is, subdirectories are included. Note that -r is required when source is a directory; without it (or another option that implies recursion, such as -a) rsync skips the directory. source is the source directory and destination is the target directory.

Synchronize multiple files or directories

$ rsync -r source1 source2 destination

In the above command, source1 and source2 will be synchronized to the destination directory.

Sync meta information

The -a parameter can replace -r. In addition to recursive synchronization, it also synchronizes meta information (such as modification time and permissions). Since rsync by default uses file size and modification time to decide whether a file needs to be updated, -a is more useful than -r. The following is the common form.

$ rsync -a source destination

If the destination directory destination does not exist, rsync will automatically create it. After executing the above command, the source directory source is completely copied to the destination directory destination, forming the directory structure of destination/source.

If you only want to synchronize the contents of the source directory source to the target directory destination, you need to add a slash after the source directory.

$ rsync -a source/ destination

After the above command is executed, the contents of the source directory will be copied to the destination directory, and a source subdirectory will not be created under destination.
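
The effect of the trailing slash is easy to check with throwaway directories. The sketch below assumes nothing beyond rsync itself; the names dest1, dest2, a.txt and sub/b.txt are made up for the illustration.

```shell
# Sketch: compare "source" with "source/" using temporary directories.
work=$(mktemp -d)
mkdir -p "$work/source/sub"
echo hello > "$work/source/a.txt"
echo world > "$work/source/sub/b.txt"

# No trailing slash: the directory itself is copied (dest1/source/...).
rsync -a "$work/source" "$work/dest1"

# Trailing slash: only the contents are copied (dest2/a.txt, dest2/sub/...).
rsync -a "$work/source/" "$work/dest2"

find "$work/dest1" "$work/dest2" | sort
```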

Simulate execution (dry run)

If you are not sure what the results of rsync will be after execution, you can first use the -n or --dry-run parameter to simulate the results of the execution.

$ rsync -anv source/ destination

In the above command, the -n parameter simulates the result of command execution and does not actually execute the command. The -v parameter outputs the results to the terminal so that you can see what content will be synchronized.
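
As a quick check that a dry run really changes nothing, the following sketch runs -anv against temporary directories and then lists the destination. The file names report.txt and plan.txt are assumptions for the example.

```shell
# Sketch: a dry run lists the planned transfer but leaves the destination untouched.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/destination"
echo data > "$work/source/report.txt"

# -n: show what would be transferred; tee keeps a copy of the plan for review.
rsync -anv "$work/source/" "$work/destination" | tee "$work/plan.txt"

ls -A "$work/destination"    # prints nothing: the directory is still empty
```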

The target directory becomes a mirror copy of the source directory

By default, rsync only ensures that all contents of the source directory (except explicitly excluded files) are copied to the target directory. It does not make the two directories identical, and it does not delete any files. If you want to make the target directory a mirror copy of the source directory, you must use the --delete parameter, which deletes files that exist only in the target directory and not in the source directory.

$ rsync -av --delete source/ destination

In the above command, the --delete parameter will make destination become a mirror of source.
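
Because --delete removes files, it is worth previewing with -n first. The sketch below works on temporary directories; keep.txt and stale.txt are made-up file names.

```shell
# Sketch: preview and then perform a mirroring sync.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/destination"
echo keep  > "$work/source/keep.txt"
echo stale > "$work/destination/stale.txt"

# Preview: lines starting with "deleting" show what --delete would remove.
rsync -avn --delete "$work/source/" "$work/destination" | grep '^deleting'

# Real run: destination now contains only what source contains.
rsync -a --delete "$work/source/" "$work/destination"
ls "$work/destination"
```

The --max-delete=NUM option from the option list can serve as an extra safety limit on how many files a single run may remove.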

Exclude files

Sometimes we want to exclude certain files or directories during synchronization. In this case, we can use the --exclude parameter to specify an exclusion pattern.

$ rsync -av --exclude='*.txt' source/ destination
# or
$ rsync -av --exclude '*.txt' source/ destination

The above command excludes all TXT files.

Note that rsync synchronizes hidden files whose names start with a dot. If you want to exclude hidden files, you can write --exclude=".*".

If you want to exclude all files in a directory, but do not want to exclude the directory itself, you can write it as follows.

$ rsync -av --exclude 'dir1/*' source/ destination

For multiple exclusion patterns, multiple --exclude parameters can be used.

$ rsync -av --exclude 'file1.txt' --exclude 'dir1/*' source/ destination

Multiple exclude patterns can also take advantage of Bash's brace expansion, so that --exclude is written only once.

$ rsync -av --exclude={'file1.txt','dir1/*'} source/ destination

If you have many exclude patterns, you can write them to a file, one line per pattern, and then specify this file with the --exclude-from parameter.

$ rsync -av --exclude-from='exclude-file.txt' source/ destination
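
A complete round trip with an exclude file might look like the sketch below. It reuses the file1.txt and dir1/* patterns from the text; keep.log and inner.dat are made-up names, and everything lives in a temporary directory.

```shell
# Sketch: exclude patterns read from a file, one pattern per line.
work=$(mktemp -d)
mkdir -p "$work/source/dir1" "$work/destination"
touch "$work/source/file1.txt" "$work/source/keep.log" "$work/source/dir1/inner.dat"

cat > "$work/exclude-file.txt" <<'EOF'
file1.txt
dir1/*
EOF

rsync -a --exclude-from="$work/exclude-file.txt" "$work/source/" "$work/destination"

# keep.log and an empty dir1/ are copied; file1.txt and dir1/inner.dat are not.
find "$work/destination" | sort
```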

Specify file patterns that must be synchronized

The --include parameter specifies file patterns that must be synchronized, and is often used in combination with --exclude.

$ rsync -av --include="*.txt" --exclude='*' source/ destination

The above command specifies that when synchronizing, all files will be excluded, but TXT files will be included.
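
One detail worth knowing: the '*' exclude pattern also matches directories, so with the rules above rsync never descends into subdirectories, and only TXT files at the top level of source are copied. Adding --include='*/' before the other rules lets it recurse. The sketch below shows both variants; the directory names flat and deep and the sample files are assumptions for the illustration.

```shell
# Sketch: include/exclude rules with and without descending into subdirectories.
work=$(mktemp -d)
mkdir -p "$work/source/sub" "$work/flat" "$work/deep"
touch "$work/source/top.txt" "$work/source/top.log" "$work/source/sub/nested.txt"

# Rules as in the text: sub/ is excluded by '*', so nested.txt is never seen.
rsync -a --include='*.txt' --exclude='*' "$work/source/" "$work/flat"

# Including '*/' first keeps directories, so TXT files at any depth are copied.
rsync -a --include='*/' --include='*.txt' --exclude='*' "$work/source/" "$work/deep"

find "$work/flat" "$work/deep" | sort
```

Rule order matters here because rsync applies the first pattern that matches each file or directory.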