Unix Sed Tutorial: Advanced Sed Substitution Examples

by Sasikala on October 26, 2009

This article is part of the ongoing Unix Sed Tips and Tricks series.

In our previous sed articles we covered sed printing, sed deletion, sed substitute, sed file write, and sed multiple commands.

In this article, let us review some interesting workarounds with the “s” substitute command in sed with several practical examples.

I. Sed Substitution Delimiter

As we discussed in our previous post, we can use different delimiters such as @ % | ; : in the sed substitute command.

Let us first create the path.txt file that will be used in the examples below.

$ cat path.txt
/usr/kbos/bin:/usr/local/bin:/usr/jbin:/usr/bin:/usr/sas/bin
/usr/local/sbin:/sbin:/bin/:/usr/sbin:/usr/bin:/opt/omni/bin:
/opt/omni/lbin:/opt/omni/sbin:/root/bin

Example 1 – sed @ delimiter: Substitute /opt/omni/lbin to /opt/tools/bin

When you substitute a path name which has ‘/’, you can use @ as a delimiter instead of ‘/’. In the sed example below, in the last line of the input file, /opt/omni/lbin was changed to /opt/tools/bin.

$ sed 's@/opt/omni/lbin@/opt/tools/bin@g' path.txt
/usr/kbos/bin:/usr/local/bin:/usr/jbin:/usr/bin:/usr/sas/bin
/usr/local/sbin:/sbin:/bin/:/usr/sbin:/usr/bin:/opt/omni/bin:
/opt/tools/bin:/opt/omni/sbin:/root/bin

Example 2 – sed / delimiter: Substitute /opt/omni/lbin to /opt/tools/bin

If you must use ‘/’ as the delimiter in a path name substitution, you have to escape every ‘/’ in the substitution data as shown below. In this sed example, the delimiter ‘/’ is escaped in both the REGEXP and the REPLACEMENT part.

$ sed 's/\/opt\/omni\/lbin/\/opt\/tools\/bin/g' path.txt
/usr/kbos/bin:/usr/local/bin:/usr/jbin:/usr/bin:/usr/sas/bin
/usr/local/sbin:/sbin:/bin/:/usr/sbin:/usr/bin:/opt/omni/bin:
/opt/tools/bin:/opt/omni/sbin:/root/bin
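
To make the equivalence of delimiters concrete, here is a minimal sketch that runs the same substitution with @, | and an escaped /. The sample line is copied from path.txt; the variable names (line, with_at, with_pipe, with_slash) are chosen only for this illustration.

```shell
# Sample line copied from the last line of path.txt.
line='/opt/omni/lbin:/opt/omni/sbin:/root/bin'

# The same substitution written with three different delimiters.
with_at=$(printf '%s\n' "$line" | sed 's@/opt/omni/lbin@/opt/tools/bin@g')
with_pipe=$(printf '%s\n' "$line" | sed 's|/opt/omni/lbin|/opt/tools/bin|g')
with_slash=$(printf '%s\n' "$line" | sed 's/\/opt\/omni\/lbin/\/opt\/tools\/bin/g')

# All three variables should hold the same rewritten line.
printf '%s\n' "$with_at" "$with_pipe" "$with_slash"
```

Picking a delimiter that does not occur in the pattern or the replacement keeps the command readable, because nothing needs to be escaped.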

II. Sed ‘&’ Get Matched String

The precise part of an input line on which the Regular Expression matches is represented by &, which can then be used in the replacement part.

Example 1 – sed & Usage: Substitute /usr/bin to /usr/bin/local

$ sed 's@/usr/bin@&/local@g' path.txt
/usr/kbos/bin:/usr/local/bin:/usr/jbin:/usr/bin/local:/usr/sas/bin
/usr/local/sbin:/sbin:/bin/:/usr/sbin:/usr/bin/local:/opt/omni/bin:
/opt/omni/lbin:/opt/omni/sbin:/root/bin

In the above example, ‘&’ in the replacement part stands for the matched text, /usr/bin, and /local is appended to it. So in the output, every occurrence of /usr/bin is replaced with /usr/bin/local.

Example 2 – sed & Usage: Match the whole line

& is replaced by whatever text matches the given REGEXP.

$ sed 's@^.*$@<<<&>>>@g' path.txt
<<</usr/kbos/bin:/usr/local/bin:/usr/jbin:/usr/bin:/usr/sas/bin>>>
<<</usr/local/sbin:/sbin:/bin/:/usr/sbin:/usr/bin:/opt/omni/bin:>>>
<<</opt/omni/lbin:/opt/omni/sbin:/root/bin>>>

In the above example regexp has “^.*$” which matches the whole line. Replacement part <<<&>>> writes the whole line with <<< and >>> in the beginning and end of the line respectively.
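
Because & is special in the replacement part, a literal ampersand has to be written as \&. The following sketch contrasts the two forms; the sample line is a shortened piece of path.txt, and the variable names are assumptions made for this illustration.

```shell
# Shortened sample taken from the second line of path.txt.
line='/usr/local/sbin:/sbin:/bin/'

# & expands to the matched text, so every /sbin is wrapped in brackets.
wrapped=$(printf '%s\n' "$line" | sed 's@/sbin@[&]@g')

# \& is a literal ampersand, so every /sbin is replaced by the character &.
literal=$(printf '%s\n' "$line" | sed 's@/sbin@\&@g')

printf '%s\n' "$wrapped" "$literal"
```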

III. Grouping and Back-references in Sed

Grouping can be used in sed just as in ordinary regular expressions. A group is opened with “\(” and closed with “\)”. Grouping can be used in combination with back-referencing.

Back-reference is the re-use of a part of a Regular Expression selected by grouping. Back-references in sed can be used in both a Regular Expression and in the replacement part of the substitute command.
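
As a small illustration of a back-reference inside the Regular Expression itself, the sketch below collapses a directory that is listed twice in a row. The input line is hypothetical (it is not taken from path.txt), and the variable names are chosen only for this example.

```shell
# Hypothetical input with the same directory listed twice in a row.
line='/usr/bin:/usr/bin:/sbin'

# \1 inside the pattern must match the same text that group 1 captured,
# so the pattern only matches a path that is immediately repeated.
deduped=$(printf '%s\n' "$line" | sed 's@\(/[^:]*\):\1@\1@')

printf '%s\n' "$deduped"
```

This is only a sketch: the pattern is not anchored to field boundaries, so it would also fire when the second field merely begins with the first one (for example /usr/bin:/usr/bin2).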

Example 1: Get only the first path in each line

$ sed 's/\(\/[^:]*\).*/\1/g' path.txt
/usr/kbos/bin
/usr/local/sbin
/opt/omni/lbin

In the above example, \(\/[^:]*\) matches the path that comes before the first ‘:’. In the replacement, \1 stands for the text matched by the first group, so each line is replaced by its first path.

Example 2: Multigrouping

In the file path.txt, reverse the order of the fields in the last line.

$ sed '$s@\([^:]*\):\([^:]*\):\([^:]*\)@\3:\2:\1@g' path.txt
/usr/kbos/bin:/usr/local/bin:/usr/jbin:/usr/bin:/usr/sas/bin
/usr/local/sbin:/sbin:/bin/:/usr/sbin:/usr/bin:/opt/omni/bin:
/root/bin:/opt/omni/sbin:/opt/omni/lbin

In the above command, $ restricts the substitution to the last line. The output shows that the order of the path values in the last line has been reversed.
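
The address in front of s can be a line number as well as $. The sketch below applies the same three-group reversal to the first line only and then to the last line only. The three-line file with single-letter fields is a hypothetical stand-in for path.txt, and the variable names are assumptions made for this example.

```shell
# Build a small hypothetical three-line file in a temporary location.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
a:b:c
d:e:f
g:h:i
EOF

# 1 addresses the first line, $ addresses the last line.
first_only=$(sed '1s@\([^:]*\):\([^:]*\):\([^:]*\)@\3:\2:\1@' "$tmpfile")
last_only=$(sed '$s@\([^:]*\):\([^:]*\):\([^:]*\)@\3:\2:\1@' "$tmpfile")

printf '%s\n' "$first_only" "$last_only"
rm -f "$tmpfile"
```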

Example 3: Get the list of usernames in /etc/passwd file

This sed example displays only the first field from the /etc/passwd file.

$ sed 's/\([^:]*\).*/\1/' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown
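
The same idea extends to more than one field. As a sketch, the command below keeps the first and the last field of a passwd-style line. The entry is a hypothetical sample held in a variable rather than read from /etc/passwd, so the result does not depend on the system.

```shell
# Sample line in /etc/passwd format (hypothetical entry, not read from the system).
entry='root:x:0:0:root:/root:/bin/bash'

# Group 1 captures the first field and group 2 the last field;
# the greedy .* in between swallows the middle fields.
user_shell=$(printf '%s\n' "$entry" | sed 's/^\([^:]*\):.*:\([^:]*\)$/\1 \2/')

printf '%s\n' "$user_shell"
```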

Example 4: Parenthesize first character of each word

This sed example puts the first character of every capitalized word in parentheses. Note that \b (word boundary) is a GNU sed extension.

$ echo "Welcome To The Geek Stuff" | sed 's/\(\b[A-Z]\)/\(\1\)/g'
(W)elcome (T)o (T)he (G)eek (S)tuff
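
Where \b is not available, a rough equivalent can be written with two plain substitutions. This is a sketch under a simplifying assumption: a word starts either at the beginning of the line or after a single space, which is narrower than a true word boundary. The variable names are chosen only for this example.

```shell
text='Welcome To The Geek Stuff'

# The first expression handles a capital at the start of the line,
# the second handles every capital that follows a space.
marked=$(printf '%s\n' "$text" | sed -e 's/^\([A-Z]\)/(\1)/' -e 's/ \([A-Z]\)/ (\1)/g')

printf '%s\n' "$marked"
```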

Example 5: Commify a simple number

Let us create a file called numbers which has a list of numbers. The sed command below inserts a single comma as the thousands separator, so it suits numbers of up to six digits. It relies on the GNU sed extensions \| and \+.

$ cat  numbers
1234
12121
3434
123

$ sed 's/\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\1\2,\3/g' numbers
1,234
12,121
3,434
123
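
For longer numbers the substitution has to be repeated until nothing is left to change. The sketch below does that with a label and the t (branch on successful substitution) command, using only basic regular expression syntax. It assumes each line starts with an unsigned run of digits; the function name commify and the sample values are hypothetical.

```shell
commify() {
  # :a defines a label, the s command inserts one comma before the last
  # three digits of the leading digit run, and ta jumps back to :a
  # whenever a substitution was made.
  sed -e ':a' -e 's/^\([0-9][0-9]*\)\([0-9]\{3\}\)/\1,\2/' -e 'ta'
}

result=$(printf '%s\n' 1234 12121 123 1234567 | commify)
printf '%s\n' "$result"
```

Each pass adds one comma from the right, and the loop ends when the leading run of digits is three characters or shorter.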

{ 9 comments }

1 Berry November 2, 2009 at 11:23 pm

Dear author,
In the example 4, what is the meaning of \b escape character? and why the REGEXP can finish the task?

2 Sasikala November 3, 2009 at 5:31 am

@Berry,

\b matches a word boundary.

If you omit \b from example 4, it parenthesizes all the uppercase letters.

$ echo "Welcome To The Geek StuffPost" | sed 's/\([A-Z]\)/\(\1\)/g'
(W)elcome (T)o (T)he (G)eek (S)tuff(P)ost

3 Berry November 7, 2009 at 1:38 am

@Sasikala,

Oh, I got it. Thank you very much. ^_^

Following your tips, I found that \< can match the left word boundary and \> can match the right. Thank you again!

4 Berry November 7, 2009 at 1:41 am

@Sasikala,

Following your tips, I found that \ plus “smaller than” (\<) can match the left boundary and \ plus “larger than” (\>) can match the right. Thank you again!

5 Guru Prasad February 27, 2010 at 1:37 am

Hi,
Regarding example 4, when I try it on Red Hat Linux it works, but when I try it on HP-UX or SunOS it does not. Any reason why?

6 Guru Prasad February 27, 2010 at 1:40 am

Also, the same is the case with example 5. Please let me know why.

7 Anjum April 9, 2010 at 11:17 am

Example 5 may only work on Linux (I don’t see how) but does not work on other systems. The following snippet does work on SunOS, and I am sure it will work on most systems out there, including Linux:

echo '1234
12121
3434
123' | sed 's/\([0-9]\{3\}\)$/,\1/g;s/^,//'

Thanks,
-Anjum.

8 Anjum April 9, 2010 at 4:10 pm

Example 4 reworked and tested on Solaris 9:

echo "Welcome To The Geek Stuff" | sed 's/\<\([A-Z]\)/\(\1\)/g'

Thanks,
-Anjum.

9 Logan August 10, 2010 at 4:35 pm

A complete example of commify (add comma as thousands separator)

$ cat numbers.txt
1
12
123
1234
12345
123456
1234567
12345678
123456789
1234567890
1234567890.1234
+1234567890.1234
-1234567890.1234

With the "-r" option, there is no need to escape the parentheses and curly brackets.

The simple code below doesn’t work for numeric strings longer than 6 digits.

$ sed -r 's/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g' numbers.txt
1
12
123
1,234
12,345
123,456
1234,567
12345,678
123456,789
1234567,890
1234567,890.1234
+1234567,890.1234
-1234567,890.1234

This one works. (taken from question 4.14 of http://go.to/sed-faq)
$ sed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta' numbers.txt
1
12
123
1,234
12,345
123,456
1,234,567
12,345,678
123,456,789
1,234,567,890
1,234,567,890.1234
+1,234,567,890.1234
-1,234,567,890.1234
