Lzma Vs Bzip2 – Better Compression than bzip2 on UNIX / Linux

by Balakrishnan Mariyappan on June 4, 2010

Lzma stands for Lempel-Ziv-Markov chain Algorithm. Like bzip2 and gzip, lzma is a tool for compressing and decompressing files. It often achieves a better compression ratio than bzip2 and decompresses considerably faster. gzip, in turn, generally compresses less tightly than either bzip2 or lzma.

In this article, let us look at how to use lzma, a compression utility that, in the test below, produced a smaller file than bzip2 and ran faster.

Compress the input text file using lzma -c

$ lzma  -c --stdout  sample.txt  >sample.lzma

Decompress the lzma file using the -d option

$ lzma -d --stdout sample.lzma >sample.txt
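The same round trip can be sketched in Python with the standard-library lzma module. This is a minimal sketch, not what the lzma command does internally: the function names are hypothetical, and it assumes that FORMAT_ALONE (the legacy .lzma container) is the format you want to match.

```python
import lzma
from pathlib import Path


def lzma_compress(data: bytes) -> bytes:
    """Compress bytes into the legacy .lzma container."""
    # FORMAT_ALONE selects the .lzma container rather than the newer .xz one.
    return lzma.compress(data, format=lzma.FORMAT_ALONE)


def lzma_decompress(blob: bytes) -> bytes:
    """Decompress .lzma (or .xz) bytes."""
    # The default FORMAT_AUTO detects both containers.
    return lzma.decompress(blob)


def compress_file(src: str, dst: str) -> int:
    """Compress file src into file dst and return the compressed size."""
    blob = lzma_compress(Path(src).read_bytes())
    Path(dst).write_bytes(blob)
    return len(blob)


def decompress_file(src: str, dst: str) -> int:
    """Decompress file src into file dst and return the restored size."""
    data = lzma_decompress(Path(src).read_bytes())
    Path(dst).write_bytes(data)
    return len(data)
```

For example, compress_file("sample.txt", "sample.lzma") followed by decompress_file("sample.lzma", "restored.txt") should leave restored.txt identical to sample.txt. Like the redirected shell commands above, these helpers leave the input file in place.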

Comparison between bzip2 and lzma compression tools

To understand the effectiveness of lzma, let us compress and decompress a 1MB sample.txt with both lzma and bzip2 and compare the results. The tests were run on a machine with a Pentium 4 processor and 1GB of RAM.

Size of the sample.txt input file:

$ ls -l sample.txt
-rw-r--r-- 1 bala bala   1048576 2010-05-14 19:43 sample.txt

Note: We prefixed every compression and decompression command with the time command to measure its elapsed and CPU time.

Compress the sample.txt using bzip2

Compress the input file with the bzip2 command. No option is required for compression; bzip2 replaces sample.txt with sample.txt.bz2.

$ time bzip2  sample.txt

real    0m27.874s
user    0m13.981s
sys     0m0.148s

$ ls -l sample.txt.bz2
-rw-r--r-- 1 bala bala      1750 2010-05-14 19:43 sample.txt.bz2

After bzip2 compression, the output file is 1750 bytes.

Decompress the sample.txt using bunzip2

Decompress the compressed file with the bunzip2 utility, which also needs no options.

$ time bunzip2  sample.txt.bz2

real    0m0.232s
user    0m0.128s
sys     0m0.020s

Compress the sample.txt using lzma

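Now, let us compress sample.txt using the lzma command with the following options:

  • -c to write the compressed output to stdout instead of replacing the input file
  • --stdout is the long form of -c, so either option alone is sufficient; compression is lzma's default mode and needs no option of its own

$ time lzma  -c --stdout  sample.txt >sample.lzma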

real    0m2.035s
user    0m1.544s
sys     0m0.132s

$ ls -l sample.lzma
-rw-r--r-- 1 bala bala       543 2010-05-14 19:48 sample.lzma

After compression, lzma produces a 543-byte output file, less than a third of the 1750 bytes produced by bzip2. Also, as seen above, lzma used much less CPU time than bzip2 (1.544s versus 13.981s of user time).
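To make the size comparison concrete, the compression ratios can be worked out from the byte counts reported by ls -l above. The helper name below is hypothetical; only the three sizes come from the test.

```python
def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Return how many input bytes each output byte represents."""
    return original_size / compressed_size


# Sizes in bytes, taken from the ls -l output in this article.
ORIGINAL = 1_048_576
BZIP2 = 1_750
LZMA = 543

print(f"bzip2 ratio: {compression_ratio(ORIGINAL, BZIP2):.1f}:1")
print(f"lzma ratio:  {compression_ratio(ORIGINAL, LZMA):.1f}:1")
print(f"lzma output is {LZMA / BZIP2:.0%} of the bzip2 output")
```

Ratios this high (roughly 599:1 and 1931:1) indicate that the sample file is extremely repetitive, so the gap on ordinary text or binary data may be much smaller.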

Decompress the sample.txt using lzma

Decompress the *.lzma file using the lzma command with the following options:

  • -d to decompress
  • --stdout to print the decompressed output to stdout

$ time lzma -d --stdout sample.lzma >sample.txt

real    0m0.043s
user    0m0.016s
sys     0m0.004s

As seen above, decompression with lzma is about five times quicker than with bunzip2 (0.043s versus 0.232s of real time).
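If you want to repeat this kind of comparison on your own data, it can be sketched in Python with the standard-library bz2 and lzma modules. This is only an illustration under assumptions: the function names are hypothetical, make_sample() generates repetitive placeholder text (the content of the original sample.txt is not known), and library timings will not match those of the command-line tools.

```python
import bz2
import lzma
import time


def benchmark(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the compressed size."""
    start = time.perf_counter()
    blob = compress(data)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(blob)
    decompress_s = time.perf_counter() - start

    if restored != data:
        raise ValueError(f"{name} round trip did not restore the input")
    return {
        "name": name,
        "size": len(blob),
        "compress_s": compress_s,
        "decompress_s": decompress_s,
    }


def make_sample(size=1_048_576):
    """Build placeholder test data of exactly `size` bytes (an assumption)."""
    line = b"The quick brown fox jumps over the lazy dog.\n"
    return (line * (size // len(line) + 1))[:size]


def run_all(data):
    """Benchmark bzip2 and lzma (legacy .lzma container) on the same data."""
    return [
        benchmark("bzip2", bz2.compress, bz2.decompress, data),
        benchmark(
            "lzma",
            lambda d: lzma.compress(d, format=lzma.FORMAT_ALONE),
            lzma.decompress,
            data,
        ),
    ]


if __name__ == "__main__":
    for row in run_all(make_sample()):
        print(
            f"{row['name']:>5}: {row['size']} bytes, "
            f"compress {row['compress_s']:.3f}s, "
            f"decompress {row['decompress_s']:.3f}s"
        )
```

To benchmark a real file, pass its contents to run_all() instead of make_sample(). Results depend heavily on the input, so it is worth testing with data that resembles what you actually store.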

Different Levels of Lzma Compression

  • Lzma provides compression levels from -1 to -9.
  • -9 gives the highest compression ratio, at the cost of more time and system resources. The level applies only to compression; it is not specified when decompressing.
  • -1 gives the lowest compression ratio and runs much quicker.

To compress quickly with the lowest compression level:

$ lzma -1 -c --stdout  sample.txt >sample.lzma

$ ls -l sample.lzma
-rw-r--r-- 1 bala bala       548 2010-05-14 20:47 sample.lzma

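Note: --fast is an alias for -1. (In XZ Utils, which provides the lzma command on many newer systems, --fast maps to -0 instead.)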

-9 is the highest compression level and takes longer to compress than the lower levels. To compress intensively with the highest level:

$ lzma -9 -c --stdout  sample.txt >sample.lzma

$ ls -l sample.lzma
-rw-r--r-- 1 bala bala       543 2010-05-14 20:55 sample.lzma

Note: --best is an alias for -9. For this sample file, the gain over -1 is small (543 bytes versus 548 bytes).
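The effect of the compression level can also be explored from Python, where the lzma module exposes it as the preset argument. This is a rough sketch with a hypothetical function name. Note the assumption: the module's presets follow liblzma and run from 0 to 9, so they may not map exactly onto the -1 to -9 levels of the lzma command described here.

```python
import lzma


def sizes_by_preset(data: bytes, presets=(1, 6, 9)) -> dict:
    """Return the compressed size in bytes for each preset level."""
    return {
        preset: len(lzma.compress(data, format=lzma.FORMAT_ALONE, preset=preset))
        for preset in presets
    }


if __name__ == "__main__":
    # Placeholder input; substitute the bytes of a real file to compare levels.
    data = b"a line of sample text that repeats\n" * 20_000
    for preset, size in sizes_by_preset(data).items():
        print(f"preset {preset}: {size} bytes")
```

Higher presets need noticeably more memory while compressing, so on a small machine it may be worth checking the lower presets first.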



14 comments

1 machielo June 4, 2010 at 2:44 am

Hi! Maybe you would like to compare to the XZ compression (which is based on LZMA):
http://tukaani.org/xz/

Here is a short comparison:
http://tukaani.org/lzma/benchmarks.html

As you can see, if you don’t mind the time it takes to compress, you can get the best ratios with XZ. Also, the decompression time is lower than that of bzip2 and it’s supported by the tar archiver ;)

2 Ikon June 4, 2010 at 3:00 am

Thank you for this valuable guide! Before reading it I always compressed my PDFs, text-based documents, and backups with bzip2. Now I’m completely switching for sure :)

3 thib June 4, 2010 at 5:10 am

Note: LZMA utils[1] are deprecated in favor of XZ utils[2]. Maybe your system aliased them.

1: http://tukaani.org/lzma/
2: http://tukaani.org/xz/

4 329692195 June 4, 2010 at 6:27 am

I also learned a useful command, it’s very useful! Thank you.

5 roque June 4, 2010 at 6:36 am

how to compress a directory?

6 Ikon June 4, 2010 at 8:26 am

roque:

First you have to tar it, then run lzma on the tar. Simple :) Just as with bzip2.
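The tar-then-lzma approach from this comment can be sketched in Python with the standard-library tarfile and lzma modules. This is an illustration only: the function names are hypothetical, and the tar archive is built in memory, which assumes the directory is small enough to fit in RAM.

```python
import io
import lzma
import tarfile
from pathlib import Path


def directory_to_tar_lzma(directory: str) -> bytes:
    """Tar a directory in memory, then compress the tar as .lzma."""
    root = Path(directory)
    buffer = io.BytesIO()
    # Step 1: bundle the directory into an uncompressed tar archive.
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        tar.add(root, arcname=root.name)
    # Step 2: compress the whole tar stream, as running lzma on the tar would.
    return lzma.compress(buffer.getvalue(), format=lzma.FORMAT_ALONE)


def list_tar_lzma(blob: bytes) -> list:
    """Decompress a .tar.lzma blob and return the member names."""
    tar_bytes = lzma.decompress(blob)
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        return tar.getnames()
```

Writing the returned bytes to a file such as backup.tar.lzma (a hypothetical name) gives the archive; list_tar_lzma() is a quick way to check what it contains.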

7 Chris F.A. Johnson June 4, 2010 at 9:14 am

In my experience, bzip2 is faster and often produces smaller files than lzma:

$ ls -l 100_0758.mov*
-rwxr-xr-x 1 chris chris 10246430 14-Apr-1903 05:49:04 100_0758.mov
$ time lzma -9 100_0758.mov

real 0m5.418s
user 0m5.159s
sys 0m0.219s
$ ls -l 100_0758.mov*
-rwxr-xr-x 1 chris chris 10094080 14-Apr-1903 05:49:04 100_0758.mov.lzma
$ lzma -d 100_0758.mov*
$ time bzip2 -9 100_0758.mov

real 0m4.486s
user 0m4.412s
sys 0m0.038s
$ ls -l 100_0758.mov*
-rwxr-xr-x 1 chris chris 10025989 14-Apr-1903 05:49:04 100_0758.mov.bz2

With a text file, lzma does produce smaller files, but it still takes longer than bzip2.

8 Ikon June 4, 2010 at 9:18 am

Chris:

bzip2 and lzma are both intended for compressing text data, not binary. On non-binary data they both produce better results than rar, for example. Keep that in mind. If you now run your benchmark on, for example, a PDF file that contains a lot of text, you will see the difference.

9 Chris F.A. Johnson June 4, 2010 at 9:45 am

They are both designed for all types of files.

The lzma man page states, “lzma provides notably better compression ratio than bzip2 especially with files having other than plain text content.”

10 Basilio Briceño June 4, 2010 at 10:35 am

Very interesting, thanks for this article. I will try lzma more often.

11 Desidia June 9, 2010 at 8:27 am

There are other things than the compression ratio to consider. bzip2 has the advantage over gzip of doing its work in “blocks”; if a compressed file is damaged or only partially transmitted over the network, the remaining blocks are unaffected and can be recovered (see the man page).

And with lzma?

12 Anonymous July 27, 2010 at 4:57 pm

I think you compressed an empty file. That is hardly a comprehensive benchmark, or anything that really shows LZMA can be better than bzip2… try compressing ~10mb of source code, that will tell others about real life use.

13 Ikon July 28, 2010 at 1:33 am

Anonymous, OK, I have compressed a full Zend Framework library with API and reference documentation that contains 16,368 items, totalling 187.5 MB. With bzip2 it’s 21.3 MB and with lzma it’s 18.9 MB, which is only 12% smaller.
So if you need to pipe it through ssh and have it decompressed and loaded, maybe at the same time, I still suggest bzip2. When you need it for storage, lzma is perfect.

14 Jagdish B. Hariyani May 3, 2013 at 3:30 am

Hi Sir, this informative tutorial is really very helpful for day-to-day backup utilities on the server, which saves space. Thank you.
