7 Linux Uniq Command Examples to Remove Duplicate Lines from File

by Himanshu Arora on May 30, 2013

The uniq command is helpful to remove or detect duplicate entries in a file. This tutorial explains a few of the most frequently used uniq command line options that you might find helpful.

The following test file is used in some of the examples below to show how the uniq command works. Note that the file is already sorted; as discussed below, uniq only detects duplicates on adjacent lines, so unsorted input should normally be sorted first.

$ cat test
aa
aa
bb
bb
bb
xx

1. Basic Usage

Syntax:

$ uniq [OPTION]... [INPUT [OUTPUT]]

For example, when the uniq command is run without any options, it collapses adjacent duplicate lines and displays only the unique lines, as shown below.

$ uniq test
aa
bb
xx
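
Keep in mind that uniq compares only adjacent lines, so duplicates that are scattered through the file are not detected unless you sort the input first. A quick illustration:

$ printf 'aa\nbb\naa\n' | uniq
aa
bb
aa

$ printf 'aa\nbb\naa\n' | sort | uniq
aa
bb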

2. Count Number of Occurrences using -c option

This option counts the number of occurrences of each line in the file.

$ uniq -c test
      2 aa
      3 bb
      1 xx
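
A handy extension of this is to pipe the counts through sort -rn, which sorts them numerically in descending order and gives a frequency table. For the test file it looks like this:

$ sort test | uniq -c | sort -rn
      3 bb
      2 aa
      1 xx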

3. Print only Duplicate Lines using -d option

This option prints each duplicated line in the file only once. As you can see below, it did not display the line “xx”, as that line is not duplicated in the test file.

$ uniq -d test
aa
bb
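
You can also combine -d with -c to show how many times each duplicated line occurs, skipping the non-duplicated lines entirely:

$ uniq -c -d test
      2 aa
      3 bb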

The -d option displayed each duplicated line only once. The -D option, in contrast, prints every occurrence of the duplicate lines. For example, the line “aa” appears twice in the test file, so the following uniq command displays “aa” twice in its output.

$ uniq -D test
aa
aa
bb
bb
bb
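
In GNU coreutils, -D is short for --all-repeated, which also accepts a method (none, prepend, or separate) for delimiting the groups of duplicates. For example, separate prints a blank line between each group:

$ uniq --all-repeated=separate test
aa
aa

bb
bb
bb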

4. Print only Unique Lines using -u option

This option prints only the lines that are not repeated in the file.

$ uniq -u test
xx
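
Combined with sort, -u is handy for finding lines that exist in only one of two files. A sketch, assuming two hypothetical files file1 and file2, each with no internal duplicates of its own:

$ sort file1 file2 | uniq -u

Lines present in both files become adjacent duplicates after the sort, so -u drops them and only the lines unique to one file remain.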

If you would like to delete lines from a file that match a certain pattern, you can use the sed delete command.
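
For instance, a minimal sed sketch that deletes every line matching the pattern “bb” from the test file used above:

$ sed '/^bb$/d' test
aa
aa
xx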

5. Limit Comparison to ‘N’ characters using -w option

This option restricts the comparison to the first ‘N’ characters only. For this example, use the following test2 input file.

$ cat test2
hi Linux
hi LinuxU
hi LinuxUnix
hi Unix

The following uniq command uses the -w option to compare only the first 8 characters of each line, and the -c option to print the number of occurrences:

$ uniq -c -w 8 test2
      3 hi Linux
      1 hi Unix

The following uniq command uses the -w option to compare only the first 8 characters, and the -D option to print all duplicate lines:

$ uniq -D -w 8 test2
hi Linux
hi LinuxU
hi LinuxUnix

6. Avoid Comparing first ‘N’ Characters using -s option

This option skips the comparison of the first ‘N’ characters of each line. For this example, use the following test3 input file.

$ cat test3
aabb
xxbb
bbc
bbd

The following uniq command uses the -s option to skip the first 2 characters of each line during comparison, and the -D option to print all duplicate lines.

Here, the first 2 characters (‘aa’ in the 1st line and ‘xx’ in the 2nd line) are not compared; the remaining characters ‘bb’ are the same in both lines, so the two lines are treated as duplicates.

$ uniq -D -s 2 test3
aabb
xxbb
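
Replacing -D with -c shows the same grouping with counts; note that uniq prints the first line of each group as the representative:

$ uniq -c -s 2 test3
      2 aabb
      1 bbc
      1 bbd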

7. Avoid Comparing first ‘N’ Fields using -f option

This option skips the comparison of the first ‘N’ whitespace-separated fields of each line. For this example, use the following test4 input file.

$ cat test4
hi hello Linux
hi friend Linux
hi hello LinuxUnix

The following uniq command uses the -f option to skip the first 2 fields of each line during comparison, and the -D option to print all duplicate lines.

Here, the first 2 fields (‘hi hello’ in the 1st line and ‘hi friend’ in the 2nd line) are not compared; the remaining field ‘Linux’ is the same in both lines, so the two lines are treated as duplicates.

$ uniq -D -f 2 test4
hi hello Linux
hi friend Linux
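
Without -D, uniq simply collapses such lines, again keeping the first line of each group as the representative:

$ uniq -f 2 test4
hi hello Linux
hi hello LinuxUnix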

Comments

1 Don Dailey May 30, 2013 at 8:29 am

For these commands to work, the original file must be properly sorted. People not familiar with this tool are going to be confused by that unless it is specifically stated. In your example file the lines are sorted, but there is no explicit mention of that.

From the man page:

Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use 'sort -u' without 'uniq'. Also, comparisons honor the rules specified by 'LC_COLLATE'.

2 Bob May 30, 2013 at 9:48 am

Good article.
One point you might want to mention is that it only works if the duplicate lines are next to one another.

So I always sort the file first and then pass it to uniq, like this:

cat filename | sort | uniq

Without the sort, a file that contains the following will not work as expected:

aa
bb
aa
bb

3 Júlio Hoffimann Mendes May 30, 2013 at 10:02 am

Very interesting options.

Thanks.

4 Jalal Hajigholamali May 30, 2013 at 2:38 pm

Hi,

Thanks a lot!

5 whizid May 30, 2013 at 9:58 pm

thanks for this wonderful and powerful tip..

6 Javier E. Pérez P. May 31, 2013 at 2:40 pm

Is there a way to avoid the use of “sort” if I had:

$ cat extensions.txt
jpg
png
jpg
gif

and only need to retrieve one occurrence? (if a line is not duplicated, show it; if it is repeated, show it only once) Currently I use:

$ sort extensions.txt |uniq
gif
jpg
png

7 Chris F.A. Johnson May 31, 2013 at 7:56 pm

To remove non-adjacent duplicate lines without re-ordering the file, use awk:

awk '!x[$0]++' "$file"

8 lau June 1, 2013 at 5:23 am

thanks..

9 shweta June 8, 2013 at 3:31 am

Really helpful :) thanks!

10 VIVEK July 30, 2013 at 8:36 pm

Your tutorials are quite useful for beginners like me!! Thanks a lot!!

11 Gayatri November 27, 2013 at 5:10 am

Awesome! I like all your articles. Very useful. Thank you!
