How to Calculate CRC Checksum in Linux using Cksum Command

by Himanshu Arora on July 13, 2012

A checksum is used to verify the integrity of data. Suppose a file is being copied over a network or between systems, and an event such as a lost network connection or a sudden reboot of the machine leaves the copy incomplete.

So how would you verify the integrity of the data? This is where a checksum comes in. There are various mechanisms for calculating a checksum; for example, in one of our earlier articles (IP header check sum) we discussed how to find the checksum of an IP header. In this article, we will focus on the Linux ‘cksum’ command, which calculates a CRC checksum for files or for data provided on standard input.

What is CRC?

CRC stands for cyclic redundancy check.

A checksum can be calculated by applying the cyclic redundancy check (CRC) mechanism to the data being communicated. Each block of data traveling over the communication channel has a CRC code, or checksum, attached to it. When the block reaches the destination, the same check is applied again to generate a checksum value. If the checksum generated at the destination matches the checksum attached to the block, the data is believed to be uncorrupted and can be used further; if the two values differ, the data is considered corrupted.

The name CRC comes from the following:

  • The mechanism is based on the fundamentals of cyclic codes (hence cyclic).
  • The code attached to the data as a checksum is redundant, i.e. it adds no new information to the data being transferred (hence redundancy).
  • It’s a check (hence check).
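To make the mechanism more concrete, here is a minimal sketch of the CRC variant that POSIX specifies for cksum, written in portable shell arithmetic. The helper names crc_update and posix_crc are made up for this illustration. The sketch works bit by bit and starts a subshell per byte, so it is only suitable for tiny inputs; the real cksum is far faster.

```shell
# Hypothetical helper: fold one octet ($2, 0-255) into the 32-bit CRC
# register ($1), most significant bit first, using the generator
# polynomial 0x04C11DB7.
crc_update() {
    _crc=$(( $1 ^ ($2 << 24) ))
    _i=0
    while [ "$_i" -lt 8 ]; do
        if [ $(( _crc & 0x80000000 )) -ne 0 ]; then
            # Top bit set: shift left and subtract (XOR) the polynomial
            _crc=$(( ((_crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF ))
        else
            _crc=$(( (_crc << 1) & 0xFFFFFFFF ))
        fi
        _i=$(( _i + 1 ))
    done
    echo "$_crc"
}

# Hypothetical helper: read data on stdin, print "<crc> <byte count>".
posix_crc() {
    crc=0
    len=0
    # od prints every input byte as an unsigned decimal number
    for byte in $(od -An -v -tu1); do
        crc=$(crc_update "$crc" "$byte")
        len=$(( len + 1 ))
    done
    # The length is appended to the data, least significant octet first
    n=$len
    while [ "$n" -gt 0 ]; do
        crc=$(crc_update "$crc" $(( n & 0xFF )))
        n=$(( n >> 8 ))
    done
    # The final register value is complemented
    echo "$(( crc ^ 0xFFFFFFFF )) $len"
}

printf 'Hi\n' | posix_crc
```

If the sketch is correct for your system, the last line should print the same two numbers as `printf 'Hi\n' | cksum`.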

The cksum command

The cksum command computes the cyclic redundancy check (CRC) for each file passed to it as an argument. CRC becomes important in situations where data integrity needs to be verified. Using cksum, one can compare the checksum of a destination file with that of the source file to determine whether the data transfer was successful.

Besides providing the CRC value, this command also produces the file size and file name in the output. The command exits with status zero in case of success and any other status value indicates failure.
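The exit status is easy to observe from a script. The following is a small sketch; the helper name cksum_status and the example paths are only assumptions for illustration.

```shell
# Hypothetical helper: run cksum quietly and print only its exit status.
cksum_status() {
    cksum "$1" >/dev/null 2>&1
    echo "$?"
}

cksum_status /etc/passwd      # a readable file: prints 0
cksum_status /no/such/file    # assumed not to exist: prints a non-zero status
```

A script can test this status directly (for example with `if cksum "$file" >/dev/null; then ...`) before trusting the printed checksum.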

Detailed information on this command is available by typing the following at the command prompt:

$ info coreutils 'cksum invocation'

cksum command examples

1. A basic example

On a very basic level, the cksum command can be used to display the checksum for a file.

$ cksum testfile.txt
3000792507 3 testfile.txt

The first value (the big number) in the output above is the checksum of the file, followed by the size of the file in bytes and finally the name of the file.
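In a script, the three fields can be split apart with the shell’s read built-in. This is just a sketch; the function name show_fields and the variable names crc, size and name are arbitrary choices.

```shell
# Hypothetical helper: split cksum's output into its three fields.
show_fields() {
    cksum "$1" | {
        # read assigns the first two words to crc and size,
        # and the rest of the line (the file name) to name
        read -r crc size name
        echo "crc=$crc size=$size name=$name"
    }
}

# Try it on a throwaway file with the same content as testfile.txt
demo=$(mktemp)
printf 'Hi\n' > "$demo"
show_fields "$demo"
rm -f "$demo"
```

Because the file name is the last field, names that contain spaces still end up whole in the name variable.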

2. Checksum changes with change in content

The test file ‘testfile.txt’ has the following contents:

$ cat testfile.txt
Hi

To calculate the checksum of the test file, pass it as an argument to the cksum command:

$ cksum testfile.txt
3000792507 3 testfile.txt

Now, modify the contents of the file:

$ cat testfile.txt
Hi everybody.

Again, pass the test file as an argument to the cksum command:

$ cksum testfile.txt
2559130041 14 testfile.txt

So we see that when the contents change, the checksum changes.

3. Change in content does not always mean increase or decrease in size

A file’s content can change without its size changing, and cksum still detects such a change. Let’s see what this means:

Check the contents of the test file ‘testfile.txt’:

$ cat testfile.txt
Hi everybody.

Note the checksum:

$ cksum testfile.txt
2559130041 14 testfile.txt

Now, change the content not by adding or deleting anything but by replacing one character with another, so that the size of the file remains the same.

$ cat testfile.txt
Hi everybudy.

As you can see, I replaced ‘o’ with ‘u’.

Compare the checksum now:

$ cksum testfile.txt
3252191934 14 testfile.txt

So we see that the checksum changed even though the only change was one character replaced by another.
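The experiment above can be reproduced without editing a file by hand. The sketch below writes the two versions with printf into a temporary directory; the file names old.txt and new.txt and the variable names are only illustrative.

```shell
demo_dir=$(mktemp -d)
printf 'Hi everybody.\n' > "$demo_dir/old.txt"
printf 'Hi everybudy.\n' > "$demo_dir/new.txt"

# Reading from stdin keeps the file name out of the output,
# so cksum prints just "<crc> <size>"
set -- $(cksum < "$demo_dir/old.txt")
crc_old=$1
size_old=$2

set -- $(cksum < "$demo_dir/new.txt")
crc_new=$1
size_new=$2

echo "old: crc=$crc_old size=$size_old"
echo "new: crc=$crc_new size=$size_new"

rm -rf "$demo_dir"
```

Both lines should report a size of 14 bytes but different checksums, which is why comparing file sizes alone is not enough to detect a change.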

4. An interrupted copy

Suppose you are copying a zipped folder containing various sub-folders and files from one location to another and, for whatever reason, the copy process gets interrupted. How would you check whether everything was copied successfully? cksum makes this possible: as we now know, after a partial copy the checksum of the destination file differs from that of the source file.

You can simulate this scenario in the following way:

I created Linux.tar.gz and Linux_1.tar.gz from the same ‘Linux’ folder. The only difference is that Linux_1.tar.gz was made when the ‘Linux’ folder contained an extra text file.

This simulates a copy of Linux_1.tar.gz that was interrupted when just one text file was left to be copied, leaving Linux.tar.gz as the incomplete target.

Now when I compare the checksums of these two files, I see:

$ cksum Linux.tar.gz
756656601 1037079 Linux.tar.gz

$ cksum Linux_1.tar.gz
2598429125 1037184 Linux_1.tar.gz

The output shows different checksum values (and different sizes), indicating that the copy is incomplete.
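The comparison can also be scripted rather than done by eye. The following is a minimal sketch; the helper name same_cksum and the demo file names are assumptions, and the short demo files merely stand in for real archives.

```shell
# Hypothetical helper: succeed only if both files have the same CRC and size.
# Returns 2 if either file cannot be read.
same_cksum() {
    a=$(cksum < "$1") || return 2
    b=$(cksum < "$2") || return 2
    [ "$a" = "$b" ]
}

work=$(mktemp -d)
printf 'complete archive contents\n' > "$work/source.bin"
cp "$work/source.bin" "$work/good_copy.bin"
# A truncated file stands in for an interrupted copy
printf 'complete arch' > "$work/partial_copy.bin"

for copy in good_copy.bin partial_copy.bin; do
    if same_cksum "$work/source.bin" "$work/$copy"; then
        echo "$copy: checksums match"
    else
        echo "$copy: checksums differ"
    fi
done

rm -rf "$work"
```

With the two archives from this example, the call would be `same_cksum Linux.tar.gz Linux_1.tar.gz`, which would report a mismatch.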

5. Checksum of standard input

The user can also type just ‘cksum’ or ‘cksum -’, enter data on stdin, and then press Ctrl+D (twice if the input does not end with a newline). cksum then prints the checksum of the data entered.

$ cksum
Lets check the checksum1135634677 23

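In the example above, we calculated the checksum of the string “Lets check the checksum”. Because no newline was typed before Ctrl+D, the output (checksum 1135634677, size 23 bytes) appears on the same line, immediately after the input.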
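The same input can be supplied non-interactively through a pipe. A small sketch: printf is used here because it does not append a trailing newline, so the byte count stays at 23 as in the interactive session.

```shell
# Feed the string to cksum on stdin; no file name appears in the output
printf 'Lets check the checksum' | cksum
```

Using `echo` instead would add a newline, giving a size of 24 and a different checksum.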



9 comments

1 Rajesh Kumar V July 13, 2012 at 9:11 am

Really excellent article. Now I know cksum. Thanks a lot for this…

2 bob July 13, 2012 at 10:28 am

Thanks!!! Great article…

3 HAK July 13, 2012 at 11:40 am

Nice article! Now I am no longer confused about CRC. Thanks.

4 Jalal Hajigholamali July 13, 2012 at 9:42 pm

Hi,

Thanks a lot..

Very nice and useful article, I sent it to my students at university…

5 steve July 15, 2012 at 8:49 pm

Hello,

Great article thanks.

However, I think md5sum is much better than the cksum as the output value is wider in length.

6 PB July 17, 2012 at 3:14 am

Checksum and CRC are very different and distinct things. Checksums are used (among other places) in IP transfers while CRCs are often used in internal file integrity checks. A checksum is a simple addition of all the bytes (or words or longs etc.) of a file together and is not infallible. CRC is much more sophisticated and, as a result, much more compute intensive and much more reliable at detecting transmission errors. So, which is it? Does cksum produce a checksum or a CRC? I suspect the former, as CRC needs more input parameters in the versions I am familiar with. RFM.

7 Ehan Chang March 8, 2013 at 1:08 am

Helpful, but could you explain CRC in more detail in a future article?

8 uzzal October 2, 2013 at 6:42 am

Thank you.. It’s helpful…

9 TJa October 18, 2013 at 3:46 am

@PB
No, CRC is just a way to calculate a checksum.

The most simple way to calculate a checksum is by exclusive-oring (adding without carry) all bytes. The next step is by adding them all up. Both methods sometimes use a preset value or a complementing algorithm for mathematical reliability reasons.

A more sophisticated calculation is a CRC (cyclic redundancy check). If done cleverly it is almost as fast as adding up the bytes. CRC is a well defined and CCITT standardised method so there can be no ambiguity. CRCs are available in 16, 32 and 64 bits. If we want to make things even more reliable we use MD5 or SHA1 checksums, which are 128 bits (MD5) or 160 bits (SHA1) long. We tend to not call these ‘checksums’ but ‘hash values’ but basically they are still checksums.

There is no difference between ‘on disk’ CRC checksums and CRC checksums used in transmission systems.
