by Ramesh Natarajan on June 29, 2011

Awk is a powerful language for manipulating and processing text files. It is especially helpful when the lines in a text file are in a record format, that is, when each record contains multiple fields separated by a delimiter. Even when the input file is not in a record format, you can still use awk for basic file and data processing. You can also write programming logic in awk when there are no input files to process.
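
As a minimal sketch of both uses, the first command below treats each colon-delimited line as a record and prints two of its fields, and the second runs some logic with no input file at all. The sample data and the variable names are made up for illustration.

```shell
# Hypothetical sample data: one record per line, fields separated by ":".
records='alice:1001:/home/alice
bob:1002:/home/bob'

# -F sets the input field separator; $1 and $3 are the first and third fields.
fields=$(printf '%s\n' "$records" | awk -F: '{ print $1, $3 }')
echo "$fields"

# A program with only a BEGIN block runs without reading any input.
total=$(awk 'BEGIN { for (i = 1; i <= 5; i++) sum += i; print sum }')
echo "$total"
```

The first command prints the name and home directory of each record; the second prints the sum of 1 through 5.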

In short, AWK is a powerful language that comes in handy for routine daily jobs.

If you are new to awk, start by reading this Awk introduction tutorial that is part of the Awk tutorial series.

The learning curve for AWK is much gentler than for most other programming languages. If you already know C, you’ll appreciate how simple and easy AWK is to learn.

AWK was originally written by three developers: A. Aho, P. Weinberger, and B. W. Kernighan. The name AWK comes from the initials of their surnames.

The following are the three variations of AWK:

1. Awk

AWK is the original AWK written by A. Aho, B. W. Kernighan and P. Weinberger.

2. Nawk

NAWK stands for “New AWK”. This is AT&T’s version of Awk.

3. Gawk

GAWK stands for “GNU AWK”. Many Linux distributions come with GAWK, although some (Debian and Ubuntu, for example) install mawk as the default awk. GAWK is fully compatible with AWK and NAWK.

On a Linux system where GAWK is the default, typing either awk or gawk invokes GAWK, because awk is a symbolic link to gawk, as shown below.

# ls -l /bin/awk /usr/bin/awk
lrwxrwxrwx 1 root root  4 Jan  5 23:13 /bin/awk -> gawk
lrwxrwxrwx 1 root root 14 Jan  5 23:13 /usr/bin/awk -> ../../bin/gawk
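
The link chain differs between systems, so a more general check is to resolve whatever awk is on the PATH to its final target. The sketch below assumes that readlink supports the -f option (as GNU coreutils and BusyBox do); the variable names are illustrative.

```shell
# Find the awk that the shell would run, then follow any symlinks to the real binary.
awk_path=$(command -v awk)
awk_target=$(readlink -f "$awk_path")
echo "awk resolves to: $awk_target"
```

Depending on the system, the result may name gawk, mawk, or another implementation.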

The following table summarizes the different features that are available in these versions. As you see below, gawk is the superset that contains all the features of original awk and nawk.

Awk Vs Nawk Vs Gawk


The following basic built-in variables FS, OFS, RS, ORS, NR, NF, and FILENAME are available in all versions of awk.

| Feature | Description | AWK | NAWK | GAWK |
|---|---|---|---|---|
| FS | Input field separator | Yes | Yes | Yes |
| OFS | Output field separator | Yes | Yes | Yes |
| RS | Record separator | Yes | Yes | Yes |
| ORS | Output record separator | Yes | Yes | Yes |
| NR | Number of the record | Yes | Yes | Yes |
| NF | Number of fields in a record | Yes | Yes | Yes |
| FILENAME | Name of the input file currently being processed | Yes | Yes | Yes |
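
A small sketch of these variables in action, using only features common to all three versions. The sample file, its contents, and the shell variable names are hypothetical.

```shell
# Hypothetical sample file: one record per line, fields separated by ":".
sample=$(mktemp)
printf 'alice:1001:admin\nbob:1002\n' > "$sample"

# FS splits input fields and OFS joins output fields.
# Assigning $1 = $1 makes awk rebuild $0 using OFS.
report=$(awk 'BEGIN { FS = ":"; OFS = "-" }
NR == 1 { print "reading " FILENAME }
{ $1 = $1; print "record " NR " has " NF " fields: " $0 }' "$sample")
echo "$report"

# RS splits the input into records on ","; ORS ends each printed record with ";".
joined=$(printf 'a,b,c' | awk 'BEGIN { RS = ","; ORS = ";" } { print }')
echo "$joined"

rm -f "$sample"
```

The first command reports the file name once and then the record number, field count, and rebuilt line for each record; the second turns the comma-separated input into a;b;c;.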

None of the following features are available in the original awk. They are available in nawk, gawk, or both, as shown below.

| Feature | Description | NAWK | GAWK |
|---|---|---|---|
| FNR | Record number within the current input file | Yes | Yes |
| ARGC | Total number of arguments passed to the awk script | Yes | Yes |
| ARGV | Array containing all awk script arguments | Yes | Yes |
| ARGIND | Index into ARGV of the file currently being processed |  | Yes |
| SUBSEP | Subscript separator for array indexes | Yes | Yes |
| RSTART | Set by the match function to the starting position of the matched text in str | Yes | Yes |
| RLENGTH | Set by the match function to the length of the matched text | Yes | Yes |
| OFMT | Awk uses this to decide how to print numeric values. Default is "%.6g" | Yes | Yes |
| ENVIRON | Array containing all environment variables and values |  | Yes |
| IGNORECASE | Default is 0. When set to 1, string and regular-expression comparisons are case-insensitive |  | Yes |
| ERRNO | Contains the error message of a failed I/O operation, e.g. while using the getline function |  | Yes |
| BINMODE | Sets binary mode for I/O. The value can be 1 (input files), 2 (output files), or 3 (all files) |  | Yes |
| CONVFMT | The format used when converting a number to a string |  | Yes |
| FIELDWIDTHS | A space-delimited list of numbers that gives the column widths. When this is set, gawk uses it instead of FS |  | Yes |
| LINT | When set to a nonzero number (indicating true), gawk displays fatal, invalid, or warning lint messages (same as the --lint command-line option) |  | Yes |
| TEXTDOMAIN | Used for internationalization |  | Yes |
| sub(str1,str2,var) | In the input string (var), the first match of str1 is replaced with str2, and the result is stored back in var | Yes | Yes |
| gsub(str1,str2,var) | Same as sub, but global: it replaces every match in the same input string (var) | Yes | Yes |
| match(str,regex) | Returns the position (a positive number) where regex first matches in str, or 0 when there is no match | Yes | Yes |
| getline < file | Reads the next line from another input file. Sets $0 and NF | Yes | Yes |
| getline var < file | Reads the next line from another input file and stores it in the variable (var) | Yes | Yes |
| toupper(str) | Converts str to upper case |  | Yes |
| tolower(str) | Converts str to lower case |  | Yes |
| \|& | Two-way communication between the awk command and an external process |  | Yes |
| systime() | Current time in epoch time. Combine with strftime, e.g. print strftime("%c", systime()) |  | Yes |
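
A short sketch of a few of these features, limited to ones the table lists for both nawk and gawk, so it should run under either. The file names, strings, and shell variable names are invented for illustration.

```shell
# Two hypothetical input files in a temporary directory.
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/a.txt"
printf 'three\n' > "$dir/b.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counts=$(awk '{ print NR, FNR, $0 }' "$dir/a.txt" "$dir/b.txt")
echo "$counts"

# sub() replaces the first match, gsub() replaces every match.
# match() returns the match position and sets RSTART and RLENGTH.
edits=$(awk 'BEGIN {
    s = "foo bar foo"
    t = s; sub(/foo/, "X", t);  print t
    t = s; gsub(/foo/, "X", t); print t
    if (match(s, /ba+r/)) print RSTART, RLENGTH
}')
echo "$edits"

rm -rf "$dir"
```

Note that match() takes the string to search first and the regular expression second.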


2 comments

1 Kerry Hoath June 29, 2011 at 3:56 am

No mention is made of mawk in your article. It is smaller and faster than gawk but has limits on NF and sprintf buffer size.

2 Mike Kupfer October 16, 2012 at 6:59 pm

awk, gawk, and nawk have varying support for regular expressions. IIUC, awk supports grep regular expressions, while gawk supports egrep regular expressions. I’m not sure how nawk fits in here.
