The Sed Text Processor
Sed (stream editor) is a utility that does transformations on a line-by-line basis. The commands you give it are run on each line of input in turn. It is useful both for processing files and in a pipe to process output from other programs, such as here:
$ wc -c * | sort -n | sed ...
Basic Syntax and Substitution
A common use of Sed is to change words within a file. You may have used "Find and Replace" in GUI-based editors. Sed can do the same job faster and with far more control:
$ sed "s/foo/bar/g" inputfile > outputfile
Let's break down this simple command. First we tell the shell to run sed. The processing we want to do is enclosed in double quotation marks; we'll come back to that in a moment. We then tell Sed the name of the inputfile and use standard shell redirection (>) to the name of our outputfile. You can specify multiple input files if you want; Sed processes them in order and creates a single stream of output from them.
The expression looks complex but is very simple once you learn to take it apart. The initial "s" means "substitute". This is followed by the text you want to find and the replacement text, with slashes (/) as separators. Thus, here we want to find "foo" in the inputfile and put "bar" in its place. Only the output file is affected; by default, Sed does not modify its input files.
Finally, the trailing "g" stands for "global", meaning to do this for the whole line. If you leave off the "g" and "foo" appears twice on the same line, only the first "foo" is changed to "bar".
$ cat testfile
this has foo then bar then foo then bar
this has bar then foo then bar then foo
$ sed "s/foo/bar/g" testfile > testchangedfile
$ cat testchangedfile
this has bar then bar then bar then bar
this has bar then bar then bar then bar
Now let's try that again without the /g on the command and see what happens.
$ cat testfile
this has foo then bar then foo then bar
this has bar then foo then bar then foo
$ sed "s/foo/bar/" testfile > testchangedfile
$ cat testchangedfile
this has bar then bar then foo then bar
this has bar then bar then bar then foo
Notice that without the "g", Sed performs the substitution only on the first match in each line.
This is all well and good, but what if you wanted to change the second occurrence of the word foo in our testfile? To change a particular occurrence, put its number after the substitute command.
$ sed "s/foo/bar/2" inputfile > outputfile
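The command above is shown without its result, so here is a small sketch of what the numeric flag does. It assumes the same two lines as the earlier testfile, fed through a pipe instead of a file; the variable name second_only is made up for this example.

```shell
# Feed the two sample lines to sed and replace only the second "foo" on each line.
second_only=$(printf '%s\n' \
  'this has foo then bar then foo then bar' \
  'this has bar then foo then bar then foo' | sed 's/foo/bar/2')

printf '%s\n' "$second_only"
# this has foo then bar then bar then bar
# this has bar then foo then bar then bar
```

The first "foo" on each line is left alone, which is the opposite of what happens when no flag is given.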
You can also combine this with the g flag (in some versions of Sed) to leave the first occurrence alone and change from the 2nd occurrence to the end of the line.
$ sed "s/foo/bar/2g" inputfile > outputfile
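Two points from this section can be checked with a short sketch: that several input files are read in order as one stream, and that a plain sed command leaves its input file as it was. The directory and file names (first.txt, second.txt, combined.txt) are invented for the illustration, and the sketch assumes the mktemp utility is available.

```shell
# Work in a throwaway directory; all file names here are hypothetical.
workdir=$(mktemp -d)
printf 'foo one\n' > "$workdir/first.txt"
printf 'foo two\n' > "$workdir/second.txt"

# Both files are read in order and written out as a single stream.
sed "s/foo/bar/" "$workdir/first.txt" "$workdir/second.txt" > "$workdir/combined.txt"

combined=$(cat "$workdir/combined.txt")
original=$(cat "$workdir/first.txt")
printf '%s\n' "$combined"   # bar one, then bar two
printf '%s\n' "$original"   # foo one: the input file still has its old contents

rm -r "$workdir"
```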
Sed Expressions Explained
Sed understands regular expressions, to which a chapter is devoted in this book. Here are some of the special characters you can use to match what you wish to substitute.
$ matches the end of a line
^ matches the start of a line
* matches zero or more occurrences of the previous character
[ ] any characters within the brackets will be matched
For example, you could change any instance of the words "cat", "can", and "car" to "dog" by using the following:
$ sed "s/ca[tnr]/dog/g" inputfile > outputfile
In the next example, the first [0-9] ensures that at least one digit must be present to be matched. The second [0-9] may be missing or may be present any number of times, because it is followed by the * metacharacter. Finally, the digits are removed because there is nothing between the second and third slashes where you can put your replacement text.
$ sed "s/[0-9][0-9]*//g" inputfile > outputfile
Inside an expression, if the first character is a caret (^), Sed matches only if the text is at the start of the line.
$ echo dogs cats and dogs | sed "s/^dogs/doggy/"
doggy cats and dogs
A dollar sign at the end of a pattern expression tells Sed to match the text only if it is at the end of the line.
$ echo dogs cats and cats | sed "s/cats$/kitty/"
dogs cats and kitty
A line changes only if the matching string is where you require it to be; if the same text occurs elsewhere in the line, it is not modified.
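The metacharacters in this section can be tried out on short strings. The sketch below uses made-up sample text and variable names, and it puts the expressions in single quotes so the shell passes characters such as $ to Sed unchanged.

```shell
# Bracket expression: "cat", "can" and "car" all become "dog".
brackets=$(echo "a cat can fix a car" | sed 's/ca[tnr]/dog/g')
printf '%s\n' "$brackets"          # a dog dog fix a dog

# One digit followed by zero or more digits, replaced with nothing.
no_digits=$(echo "abc123def45" | sed 's/[0-9][0-9]*//g')
printf '%s\n' "$no_digits"         # abcdef

# Anchored patterns leave the line alone when the text is in the wrong place.
not_at_start=$(echo "cats and dogs" | sed 's/^dogs/doggy/')
not_at_end=$(echo "cats and dogs" | sed 's/cats$/kitty/')
printf '%s\n' "$not_at_start"      # cats and dogs
printf '%s\n' "$not_at_end"        # cats and dogs
```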
Deletion
The "d" command deletes an entire line that contains a matching pattern. Unlike the "s" (substitute) command, the "d" goes after the pattern.
$ cat testfile
line with a cat
line with a dog
line with another cat
$ sed "/cat/d" testfile > newtestfile
$ cat newtestfile
line with a dog
The regular expression ^$ means "match a line that has nothing between the beginning and the end", in other words, a blank line. So you can remove all blank lines using the "d" command with that regular expression:
$ sed "/^$/d" inputfile > outputfile
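As a quick check of the blank-line idiom, the sketch below runs it on a few made-up lines. The second command is a variation, not from the text above: because ^$ matches only truly empty lines, it uses " *" (zero or more spaces) to catch lines that contain nothing but spaces as well.

```shell
# Three lines of text with empty lines mixed in.
cleaned=$(printf 'first\n\nsecond\n\n\nthird\n' | sed '/^$/d')
printf '%s\n' "$cleaned"
# first
# second
# third

# Variation: also delete a line that holds only spaces.
no_space_lines=$(printf 'a\n   \nb\n' | sed '/^ *$/d')
printf '%s\n' "$no_space_lines"
# a
# b
```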
Controlling Printing
Suppose you want to print certain lines and suppress the rest. That is, instead of specifying which lines to delete using "d", you want to specify which lines to keep.
This can be done with two features:
Specify the -n option, which means "do not print lines by default".
End the pattern with "p" to print the line matched by the pattern.
We'll show this with a file that contains names:
$ cat testfile
Mr. Jones
Mrs. Jones
Mrs. Lee
Mr. Lee
We've decided to standardize on "Ms" for women, so we want to change "Mrs." to "Ms". The pattern is:
s/Mrs\./Ms/
and to print only the lines we changed, enter:
$ sed -n "s/Mrs\./Ms/p" testfile
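To see why both features are needed, the sketch below runs the substitution with and without -n on the same four names. The variable names are invented for the example and the input comes from a pipe, not from testfile.

```shell
names=$(printf '%s\n' 'Mr. Jones' 'Mrs. Jones' 'Mrs. Lee' 'Mr. Lee')

# With -n, only the lines where the substitution succeeded are printed.
changed_only=$(printf '%s\n' "$names" | sed -n 's/Mrs\./Ms/p')
printf '%s\n' "$changed_only"
# Ms Jones
# Ms Lee

# Without -n, every line is printed by default and "p" prints changed lines a second time.
with_default=$(printf '%s\n' "$names" | sed 's/Mrs\./Ms/p')
printf '%s\n' "$with_default"
# Mr. Jones
# Ms Jones
# Ms Jones
# Ms Lee
# Ms Lee
# Mr. Lee
```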
Multiple Patterns
Sed can be passed more than one operation at a time. We can do this by specifying each pattern after an -e option.
$ echo Gnus eat grass | sed -e "s/Gnus/Penguins/" -e "s/grass/fish/"
Penguins eat fish
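One detail worth checking is that the expressions run in the order given, each working on the result of the one before it. The sketch below illustrates this with a made-up second replacement ("Seals"); swapping the two -e options changes the outcome.

```shell
# The second expression sees "Penguins", which the first one just produced.
chained=$(echo "Gnus eat grass" | sed -e 's/Gnus/Penguins/' -e 's/Penguins/Seals/')
printf '%s\n' "$chained"     # Seals eat grass

# In the opposite order, "Penguins" does not exist yet when its rule runs.
reversed=$(echo "Gnus eat grass" | sed -e 's/Penguins/Seals/' -e 's/Gnus/Penguins/')
printf '%s\n' "$reversed"    # Penguins eat grass
```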
Controlling Edits With Patterns
We can also be more specific about which lines a pattern gets applied to. By supplying a pattern before the operation, you restrict the operation to lines that have that pattern.
$ cat testfile
one: number
two: number
three: number
four: number
one: number
three: number
two: number
$ sed "/one/ s/number/1/" testfile > testchangedfile
$ cat testchangedfile
one: 1
two: number
three: number
four: number
one: 1
three: number
two: number
The sed command in that example had two patterns. The first pattern, "one", simply controls which lines Sed changes. The second pattern replaces "number" with "1" on those lines.
This works with multiple patterns as well.
$ cat testfile
one: number
two: number
three: number
four: number
one: number
three: number
two: number
$ sed -e "/one/ s/number/1/" -e "/two/ s/number/2/" \
      -e "/three/ s/number/3/" -e "/four/ s/number/4/" \
      < testfile > testchangedfile
$ cat testchangedfile
one: 1
two: 2
three: 3
four: 4
one: 1
three: 3
two: 2
Controlling Edits With Line Numbers
Instead of specifying patterns that can operate on any line, we can specify an exact line or range of lines to edit.
$ cat testfile
even number
odd number
odd number
even number
$ sed "2,3 s/number/1/" < testfile > testchangedfile
$ cat testchangedfile
even number
odd 1
odd 1
even number
The comma acts as the range separator, telling Sed to work only on lines two through three.
$ cat testfile
even number
odd number
odd number
$ sed -e "2,3 s/number/1/" -e "1 s/number/2/" < testfile > testchangedfile
$ cat testchangedfile
even 2
odd 1
odd 1
Sometimes you might not know exactly how long a file is, but you want to go from a specified line to the end of the file. You could use wc or the like and count the total lines, but you can also use a dollar sign ($) to represent the last line:
$ sed "25,$ s/number/1/" < testfile > testchangedfile
The $ in an address range is Sed's way of specifying, "all the way to the end of the file".
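A small sketch of the $ address on a five-line input follows. The sample text and variable names are made up, and the expressions are in single quotes on purpose: inside double quotes the shell may treat a $ followed by a letter (as in $d) as one of its own variables before Sed ever sees it.

```shell
# Change "number" only from line 4 through the last line.
tail_changed=$(printf 'number\nnumber\nnumber\nnumber\nnumber\n' | sed '4,$ s/number/1/')
printf '%s\n' "$tail_changed"
# number
# number
# number
# 1
# 1

# On its own, $ addresses just the last line; here that line is deleted.
without_last=$(printf 'a\nb\nc\n' | sed '$d')
printf '%s\n' "$without_last"
# a
# b
```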
Scripting Sed Commands
By using the -f argument to the sed command, you can feed Sed a list of commands to run. For example, if you put the following patterns in a file called sedcommands:
s/foo/bar/g
s/dog/cat/g
s/tree/house/g
s/little/big/g
You can use this on a single file by entering the following:
$ sed -f sedcommands < inputfile > outputfile
Note that each command in the file must be on a separate line.
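The whole round trip can be sketched as follows: write the command file with one command per line, then run it with -f. The temporary directory, the sample sentence, and the variable names are assumptions made for this illustration, and the sketch assumes the mktemp utility is available.

```shell
# Create the command file in a throwaway directory, one command per line.
workdir=$(mktemp -d)
cat > "$workdir/sedcommands" <<'EOF'
s/foo/bar/g
s/dog/cat/g
s/tree/house/g
s/little/big/g
EOF

# Every command in the file is applied, in order, to each input line.
scripted=$(echo "the little dog sat under a tree eating foo" | sed -f "$workdir/sedcommands")
printf '%s\n' "$scripted"
# the big cat sat under a house eating bar

rm -r "$workdir"
```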
There is much more to Sed than can be written in this chapter. In fact, whole books have been written about Sed, and there are many excellent tutorials about Sed online.