bzip2

Compress files into bz2 format

Description

The bzip2 command creates and manages (including decompressing) compressed files in the ".bz2" format.

bzip2 compresses files using the Burrows-Wheeler block-sorting text compression algorithm and Huffman coding. The compression ratio is generally considerably better than that of LZ77/LZ78-based compressors, and approaches the performance of the PPM family of statistical compressors.

The command line arguments are intentionally designed to be very close to the form of GNU gzip, but not exactly the same.

bzip2 reads file names and options from the command line. Each file is replaced by a compressed file named "original filename.bz2". Each compressed file has the same modification time, permissions and, where possible, the same owner as the original file, so these properties can be correctly restored when decompressing. Some file systems, such as MSDOS, have no concept of permissions, ownership or time, or place strict limits on the length of file names. On such systems, bzip2 has no mechanism for preserving the original file name, owner, permissions and time. In this sense, bzip2's handling of file names is naive.

bzip2 and bunzip2 do not overwrite existing files by default. If you want to overwrite an existing file, specify the -f option.

If no file name is specified, bzip2 compresses data from standard input and writes it to standard output. In this case, bzip2 refuses to write the compressed output to a terminal, because the result would be unreadable and serve no purpose.
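
The filter behavior can be sketched as a round trip through a pipe. This is a minimal illustration that assumes bzip2 is installed and on the PATH; the variable names are only for the example.

```shell
# No file names are given, so bzip2 reads stdin and writes stdout.
original='hello, bzip2'
roundtrip=$(printf '%s' "$original" | bzip2 | bzip2 -d)
printf '%s\n' "$roundtrip"
```

Because the compressed data goes into a pipe rather than a terminal, bzip2 does not refuse to write it.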

bunzip2 (and bzip2 -d) decompresses all specified files. Files not generated by bzip2 are ignored and a warning message is issued. bzip2 determines the decompressed file name from the compressed file name as follows:

filename.bz2 is decompressed into filename
filename.bz is decompressed into filename
filename.tbz2 is decompressed into filename.tar
filename.tbz is decompressed into filename.tar
anyothername is decompressed into anyothername.out

If the file name's suffix is not one of .bz2, .bz, .tbz2 or .tbz, bzip2 complains that it cannot determine the original file name and uses the given file name plus .out as the decompressed file name.

As with compression, if no file name is provided, decompression reads data from standard input and writes the result to standard output.
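
The suffix rules above can be expressed as a small shell function. bunzip2_output_name is a hypothetical helper written for this illustration, not part of bzip2; it only predicts the name and does not touch any file.

```shell
# Hypothetical helper: predicts the output name bunzip2 would choose,
# following the suffix rules listed above.
bunzip2_output_name() {
    case "$1" in
        *.tbz2) printf '%s.tar\n' "${1%.tbz2}" ;;
        *.tbz)  printf '%s.tar\n' "${1%.tbz}" ;;
        *.bz2)  printf '%s\n' "${1%.bz2}" ;;
        *.bz)   printf '%s\n' "${1%.bz}" ;;
        *)      printf '%s.out\n' "$1" ;;   # unknown suffix: append .out
    esac
}

bunzip2_output_name archive.tbz2     # prints archive.tar
bunzip2_output_name notes.txt.bz2    # prints notes.txt
bunzip2_output_name data.bin         # prints data.bin.out
```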

bzip2 uses 32-bit CRCs to verify that the decompressed file is identical to the original. This guards against corruption of the compressed data and against undetected bugs in bzip2 (which, with luck, are very unlikely). The chance of data corruption going undetected is extremely small, approximately 1 in 4 billion for each file processed. Note that the check is performed during decompression, so it can only tell you that something is wrong. It cannot help restore the original uncompressed data. You can use bzip2recover to try to recover data from damaged files.
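
A rough way to see the integrity checking in action is to damage a compressed file and run bzip2 -t on it. The sketch below assumes bzip2, mktemp and head -c are available; cutting the file short is only a crude stand-in for real corruption, and the file names are made up for the example.

```shell
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
printf 'some sample text\n' > "$workdir/sample.txt"
bzip2 -k "$workdir/sample.txt"     # creates sample.txt.bz2, keeps the original

# An intact file should pass the integrity test.
if bzip2 -t "$workdir/sample.txt.bz2"; then intact_status=0; else intact_status=$?; fi

# Simulate damage by keeping only the first 20 bytes of the compressed file.
head -c 20 "$workdir/sample.txt.bz2" > "$workdir/damaged.bz2"
if bzip2 -t "$workdir/damaged.bz2" 2>/dev/null; then damaged_status=0; else damaged_status=$?; fi

echo "intact: $intact_status, damaged: $damaged_status"
rm -rf "$workdir"
```

The damaged file is expected to produce a non-zero status, which tells you that something is wrong but not how to repair it.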

Return value: 0 for a normal exit, 1 for environmental problems (file not found, invalid options, I/O errors, etc.), 2 to indicate that a compressed file is damaged, and 3 for an internal consistency error (such as a bug) that caused bzip2 to abort.

Syntax
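
In a script, the return value can be turned into a readable message. describe_bzip2_status is a hypothetical helper written for this sketch; the messages paraphrase the meanings listed above, and the path /nonexistent/file.bz2 is assumed not to exist.

```shell
# Hypothetical helper: maps a bzip2 exit status to a short description.
describe_bzip2_status() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "environmental problem (file not found, bad option, I/O error)" ;;
        2) echo "compressed file is damaged" ;;
        3) echo "internal consistency error" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Example: testing a missing file is expected to count as an environmental problem.
if bzip2 -t /nonexistent/file.bz2 2>/dev/null; then status=0; else status=$?; fi
describe_bzip2_status "$status"
```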

bzip2 [ -cdfkqstvzVL123456789 ] [ filenames ... ]

Options

-c --stdout
     # Compress or decompress data to standard output.

-d --decompress
     # Force decompression. bzip2, bunzip2 and bzcat are actually the same program, and the name it is invoked under determines which operation is performed. Specifying this option overrides that mechanism and forces bzip2 to decompress.

-z --compress
     # The complement of the -d option: forces compression regardless of the name under which the program is invoked.

-t --test
     # Check the integrity of the specified files without decompressing them. A trial decompression is actually performed on the data and the result is discarded.

-f --force
     # Force overwriting of output files. Normally bzip2 will not overwrite already existing files. This option also forces bzip2 to break hard links on files, which bzip2 does not do by default.

-k --keep
     # Preserve input files when compressing or decompressing (do not delete them).

-s --small
     # Reduce memory usage during compression, decompression and testing. Files are decompressed and tested using a modified algorithm that requires only 2.5 bytes per block byte. This means any file can be decompressed in 2300k of memory, albeit at about half the usual speed.

     # When compressing, -s selects a block size of 200k, which limits memory use to roughly the same figure, at the expense of a lower compression ratio. In short, if the machine is low on memory (8 megabytes or less), the -s option can be used for all operations. See Memory Management below.

-q --quiet
     # Suppress unimportant warning messages. Messages related to I/O errors and other critical events will not be suppressed.

-v --verbose
     # Verbose mode: display the compression ratio of each file processed. Additional -v options on the command line increase the verbosity level, causing bzip2 to print a lot of information that is mainly useful for diagnostics.

-L --license -V --version
     # Display software version, license terms and conditions.

-1 to -9
     # Set block length to 100 k, 200 k .. 900 k when compressing. Has no effect on decompression. See Memory Management below.
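
The effect of the block size can be observed by compressing the same input with -1 and -9. This is a sketch under assumptions: bzip2 and mktemp are available, and the generated sample text (roughly 1 MB) is invented for the example. The sizes it prints depend on the data, so none are shown here.

```shell
workdir=$(mktemp -d)
# Build roughly 1 MB of sample text (hypothetical test data).
i=0
while [ "$i" -lt 20000 ]; do
    printf 'line %d: the quick brown fox jumps over the lazy dog\n' "$i"
    i=$((i + 1))
done > "$workdir/sample.txt"

# Compress the same input with the smallest and largest block sizes.
bzip2 -1 -c "$workdir/sample.txt" > "$workdir/sample.1.bz2"
bzip2 -9 -c "$workdir/sample.txt" > "$workdir/sample.9.bz2"

size1=$(wc -c < "$workdir/sample.1.bz2")
size9=$(wc -c < "$workdir/sample.9.bz2")
echo "-1: $size1 bytes, -9: $size9 bytes"

# Decompression takes no block-size option.
restored=$(bzip2 -dc "$workdir/sample.9.bz2" | wc -c)
origsize=$(wc -c < "$workdir/sample.txt")
rm -rf "$workdir"
```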

--
     # Treat all subsequent command-line arguments as file names, even if they begin with a minus sign "-". This option makes it possible to process file names that start with a minus sign, for example: bzip2 -- -myfilename.

--repetitive-fast --repetitive-best
     # These options are redundant in versions 0.9.5 and above. In earlier versions, they provided some coarse control over the behavior of the sorting algorithm, which was useful in some cases. Versions 0.9.5 and above use an improved algorithm regardless of these options.

Parameters

File: Specify the file to compress.

Example

Compress the specified file filename:

bzip2 filename
or
bzip2 -z filename

Here, nothing is printed during compression. The original file filename is deleted and replaced with filename.bz2. If filename.bz2 already exists, it is not replaced and an error is reported (to replace it, specify the -f option, for example bzip2 -f filename). If filename is a directory, an error is reported and nothing is done. If filename has already been compressed and has a bz2 suffix, bzip2 reports this and does not compress it again. If it has no bz2 suffix, it is compressed again.
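
The overwrite behavior can be checked in a scratch directory. This is a minimal sketch assuming bzip2 and mktemp are available; report.txt and its contents are hypothetical.

```shell
workdir=$(mktemp -d)
printf 'new contents\n' > "$workdir/report.txt"
printf 'placeholder\n' > "$workdir/report.txt.bz2"   # pre-existing output file

# Without -f, bzip2 should refuse and leave both files alone.
if bzip2 "$workdir/report.txt" 2>/dev/null; then plain_status=0; else plain_status=$?; fi

# With -f, the existing report.txt.bz2 is overwritten and report.txt is removed.
bzip2 -f "$workdir/report.txt"
restored=$(bzip2 -dc "$workdir/report.txt.bz2")
echo "without -f: status $plain_status; after -f: $restored"
rm -rf "$workdir"
```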

Extract the specified file filename.bz2:

bzip2 -d filename.bz2
or
bunzip2 filename.bz2

Here, nothing is printed during decompression, and the original file filename.bz2 is replaced with filename. If filename already exists, it is not replaced and an error is reported (to replace it, specify the -f option, for example bzip2 -df filename.bz2).

Print the result when compressing or decompressing:

bzip2 -v filename

After input, the output is as follows:

filename: 0.119:1, 67.200 bits/byte, -740.00% saved, 5 in, 42 out.

Here, the -v option makes bzip2 print the compression statistics. Only compression is shown; decompression works the same way with bzip2 -dv filename.bz2, so no separate example is given.

Test decompression without actually decompressing:

bzip2 -tv filename.bz2

After input, the output is as follows:

filename.bz2: ok

Here, -t performs a trial decompression without producing any output file; in other words, it checks the file. Even if a file named filename already exists in the directory, no error is reported, because nothing is actually written. The -v option is added so that the result is printed on the screen. With a real decompression (bzip2 -dv filename.bz2), the output shows "done" instead of "ok".

Keep the original file in addition to producing the result file when compressing or decompressing:

bzip2 -k filename

Here, -k keeps the original file; without it, the original file is replaced by the result file. Only compression is shown; decompression works the same way with bzip2 -dk filename.bz2.

Extract to standard output:

bzip2 -dc filename.bz2

After input, the output is as follows:

hahahhaahahha

Here, -c sends the result to standard output. The output is the content of the file filename, and filename.bz2 is not deleted.

Compress to standard output:

bzip2 -c filename
bzip2: I won't write compressed data to a terminal.
bzip2: For help, type: `bzip2 --help'.

Here, -c compresses to standard output without deleting the original file. The difference from decompression is that bzip2 refuses to write compressed data to a terminal, so the output must be redirected to a file or piped to another program.

Treat everything that follows as a file name (even if it starts with '-'):
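
When standard output is redirected to a file rather than a terminal, the same command succeeds. The following sketch assumes bzip2 and mktemp are available and uses a scratch directory; the file contents are made up for the example.

```shell
workdir=$(mktemp -d)
printf 'keep me\n' > "$workdir/filename"

# Standard output is a file here, not a terminal, so bzip2 writes the data.
bzip2 -c "$workdir/filename" > "$workdir/filename.bz2"

# The original is still present because -c does not delete the input.
if [ -f "$workdir/filename" ]; then original_kept=yes; else original_kept=no; fi
roundtrip=$(bzip2 -dc "$workdir/filename.bz2")
echo "original kept: $original_kept; contents: $roundtrip"
rm -rf "$workdir"
```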

bzip2 -- -myfilename

This prevents a leading - in a file name from being mistaken for an option.