Unix Sed Tutorial: 6 Examples for Sed Branching Operation

by Sasikala on December 23, 2009

Linux Sed Examples for Branching OperationsThis article is part of the on-going Unix Sed Tips and Tricks series.

Like any other programming language, sed also provides special branching commands to control the flow of the program.

In this article, let us review following two types of Sed branching.

  1. Sed Unconditional Branch
  2. Sed Conditional Branch

Sed Unconditional Branch Syntax:

$ sed ':label command(s) b label'
  • :label - specification of label.
  • commands - Any sed command(s)
  • label - Any Name for the label
  • b label – jumps to the label with out checking any conditions. If label is not specified, then jumps to the end of the script.

Sed Conditional Branch Syntax:

$ sed ':label command(s) t label'
  • :label - specification of label.
  • commands - Any sed command(s)
  • label - Any Name for the label
  • t label – jumps to the label only if the last substitute command modified the pattern space. If label is not specified, then jumps to the end of the script.

Create a sample test file

Let us first create thegeekstuff.txt file that will be used in the examples mentioned below.

$ cat thegeekstuff.txt
Linux
        Administration
        Scripting
                Tips and Tricks
Windows
        Administration
Database
        Administration of Oracle
        Administration of Mysql
Security
        Network
                 Online\
        Security
Productivity
        Google Search\
        Tips
        "Web Based Time Tracking,
        Web Based Todo list and
        Reduce Key Stores etc"
$

I. Sed Examples for Unconditional Branch

Sed Example 1. Replace the first occurrence of a pattern in a whole file

In the file thegeekstuff.txt replace the first occurrence of “Administration” to “Supervision”.

$ sed '/Administration/{
 s/Administration/Supervision/
 :loop
 n
 b loop
 }' thegeekstuff.txt
Linux
	Supervision
        Scripting
                Tips and Tricks
Windows
        Administration
Database
        Administration of Oracle
        Administration of Mysql
Security
        Network
                 Online\
        Security
Productivity
        Google Search\
        Tips
        "Web Based Time Tracking,
        Web Based Todo list and
        Reduce Key Stores etc"
  1. In the above sed command, it just read line by line and prints the pattern space till Administration occurs.
  2. Once Administration occurs, substitute Administration to Supervision (only single occurrence, note that no ‘g’ flag in substitution).
  3. Once the first occurrence has been replaced, just read the remaining file content and print.
    • “n” is a sed command which prints the pattern space and overwrite it with next line.
    • Used “loop” as a label. “n” prints the current line and overwrite pattern space with the next line. b loop jumps to the :loop again. So this loop prints the remaining content of thegeekstuff.txt.

Sed Example 2. Remove the data between pattern ” ” in a whole file

In our example file there are three lines between “”.

sed -e ':loop
$!{
N
/\n$/!b loop
}
s/\"[^\"]*\"//g' thegeekstuff.txt
Linux
        Administration
        Scripting
                Tips and Tricks
Windows
        Administration
Database
        Administration of Oracle
        Administration of Mysql
Security
        Network
                 Online\
        Security
Productivity
        Google Search\
        Tips
$
  1. Above command keep appends all the lines of a file till end of file occurs.
    • $! – If its not a end of file.
    • N – Append the next line with the pattern space delimited by \n
    • /\n$/!b loop – If this is not the last line of the file jump to the loop again.
  2. Now all the lines will be available in pattern space delimited by newline. Substitute all the occurrence of data between ” with the empty.

Sed Example 3. Remove the HTML tags of a file

Let us say, I have a file with the following html content

$ cat index.html
<html><body>
<table
border=2><tr><td valign=top
align=right>1.</td>
<td>Line 1 Column 2</
td>
</table>
</body></html>

The following sed command removes all the html tags from the given file

$ sed '/</{
:loop
s/<[^<]*>//g
/</{
N
b loop
}
}' index.html

1.
Line 1 Column 2
  1. Each time find a line contains ‘<’, first remove all HTML tags of that line.
  2. If now the pattern space contains ‘<’, this implies a multi-line tag. Now repeat the following loop:
    • Join next line
    • Remove all HTML tags until no single ‘<’ exists
  3. When no ‘<’ exists in the pattern space, we print it out and start a new cycle.

II. Sed Examples for Conditional Branch

Sed Example 4. If a line ends with a backslash append the next line to it.

Our example file has two lines ends with backslash, now we have to append its next line to it.

$ sed '
:loop
/\\$/N
s/\\\n */ /
t loop' thegeekstuff.txt
Linux
        Administration
        Scripting
                Tips and Tricks
Windows
        Administration
Database
        Administration of Oracle
        Administration of Mysql
Security
        Network
                 Online Security
Productivity
        Google Search Tips
        "Web Based Time Tracking,
        Web Based Todo list and
        Reduce Key Stores etc"
  1. Check if the line ends with the backslash (/\\$/), if yes, read and append the next line to pattern space, and substitute the \ at the end of the line and number of spaces followed by that, with the single space.
  2. If the substitution is success repeat the above step. The branch will be executed only if substitution is success.
  3. Conditional branch mostly used for recursive patterns.

Sed Example 5. Commify a numeric strings.

 sed '
 :loop
 s/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/
 t loop'
12342342342343434
12,342,342,342,343,434
  1. Group the digits into two groups.
  2. The first group is all the digits up to last three digits. The last three digits gets captures in the 2nd group.
  3. Then the two matching groups get separated by a comma. Then the same rules get applied to the line again and again until all the numbers have been grouped in groups of three.
  4. For example, in the first iteration it will be 12342342342343,434
  5. In the next iteration 12342342342,343,434 and goes on till there are less than three digits.

Sed Example 6. Formatting : Replace every leading space of a line with ‘+’

$ sed '
s/^ */&\n/
:loop
s/^\n//;s/ \n/\n+/
t loop' test
Linux
++++++++Administration
++++++++Scripting
++++++++++++++++Tips and Tricks
Windows
++++++++Administration
Database
++++++++Administration of Oracle
++++++++Administration of Mysql
Security
++++++++Network
+++++++++++++++++Online\
++++++++Security
Productivity
++++++++Google Search\
++++++++Tips
++++++++"Web Based Time Tracking,
++++++++Web Based Todo list and
++++++++Reduce Key Stores etc"
  1. Seperate all the leading spaces and other characters of a line with a newline character.
  2. Now replace space and newline with newline and +. So from right to left space will be replaced by + and newline will be moved left for one character.
  3. At last in the beginning of the line \n will be there, so remove that new line.

Linux Sysadmin Course Linux provides several powerful administrative tools and utilities which will help you to manage your systems effectively. If you don’t know what these tools are and how to use them, you could be spending lot of time trying to perform even the basic administrative tasks. The focus of this course is to help you understand system administration tools, which will help you to become an effective Linux system administrator.
Get the Linux Sysadmin Course Now!

If you enjoyed this article, you might also like..

  1. 50 Linux Sysadmin Tutorials
  2. 50 Most Frequently Used Linux Commands (With Examples)
  3. Top 25 Best Linux Performance Monitoring and Debugging Tools
  4. Mommy, I found it! – 15 Practical Linux Find Command Examples
  5. Linux 101 Hacks 2nd Edition eBook Linux 101 Hacks Book

Bash 101 Hacks Book Sed and Awk 101 Hacks Book Nagios Core 3 Book Vim 101 Hacks Book

{ 5 comments… read them below or add one }

1 Catalin December 24, 2009 at 12:51 pm

This is a good article . I like it !

2 Berry December 25, 2009 at 2:18 am

In Example 2,
4th line, ‘/\n$/!b loop’, the article said it checks if it comes to the last line of the file

but actuallty, it checks whether it is a blank line, because when a blank line appends to the pattern buffer, the pattern buffer only adds a ‘\n’ to seperate the orginal pattern buffer and the new blank line, then the buffer ends with a ‘\n’

it means that the script will not loop if it sees a blank line, so if the file is (for example):

abcd
“efg
blank line
hij”

the script will not remove the text between multiline double quotes

and because the script has checked if it comes to the last line in 2nd line of the script, I wonder can we use ‘b loop’ directly in the 4th line of the script

Thank you very much for sharing so great things with us, ^_^

3 cjk December 31, 2009 at 4:21 pm

Writing larger pieces in sed can quickly get awkward (no pun intended). In those cases I recommend Perl, also because it has PCRE which are a lot more logical than the REs of the old tools.

4 Ajay January 28, 2012 at 9:57 pm

I also recommend perl.

5 Manu March 20, 2013 at 11:00 pm

For beginner, this seems to be difficult. Can you please suggest something for beginner to start with for learning sed/awk?

Leave a Comment

Previous post:

Next post: