Bash Rules: Sed and No DuPlication

I needed to remove duplicate names from a file, so I thought sed would be a good choice for it.
uniq is too easy, so I worked out how sed does it. Here's how:


$  sed '$!N; /^\(.*\)\n\1$/!P; D' filename
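
Here is a quick way to try the one-liner on a throwaway file. The temporary file and the names in it are made up for illustration; nothing here comes from my real data.

```shell
# Hypothetical sample input: a temp file holding a few repeated names.
names=$(mktemp)
printf '%s\n' alice alice bob carol carol carol alice > "$names"

# Collapse each run of identical neighbouring lines into a single line.
result=$(sed '$!N; /^\(.*\)\n\1$/!P; D' "$names")
printf '%s\n' "$result"

rm -f "$names"
```

This should print alice, bob, carol and then alice again. The last alice survives because the command only compares neighbouring lines, just like uniq, so sort the file first if the duplicates can be scattered around.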

So now time for an explanation.

1. $!N - sed reads one line at a time into the pattern space (the buffer sed works on), leaving off the newline at the end of the line. The N command appends a newline and then the next line of input to the pattern space.
$ denotes the last line and ! means NOT, so $!N means: run N on every line except the last. Heck, as if there were anything left to read after the last line. It also matters in practice: on some sed implementations, N with no next line quits without printing the pattern space, which would lose the final line.

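2. /^\(.*\)\n\1$/!P - \(.*\) captures everything from the start of the pattern space up to the newline, and \1 must then match exactly the same text after the newline. In other words, the pattern matches when the two lines in the pattern space are identical. The ! negates it: if the lines are identical, print nothing; otherwise P prints the first part of the pattern space, up to the newline.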

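3. D - deletes the first part of the pattern space, up to and including the newline, and restarts the command cycle with whatever is left, without reading a new line first, i.e. it goes back to $!N. Because D never lets the cycle run to its end, sed's automatic print never fires, so P is the only thing that writes output.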
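
The three steps can also be written one command per line, which makes it easier to match them to the explanation. The sketch below does that and compares the result with uniq on a made-up input (the letters are only an assumption for the demo).

```shell
# Hypothetical input; any text with adjacent repeats will do.
input=$(printf '%s\n' a a b b b c a)

# Same script as the one-liner, one command per line:
#   $!N               append the next line unless we are on the last one
#   /^\(.*\)\n\1$/!P  print the first line unless both lines are identical
#   D                 drop the first line and restart the cycle
expanded=$(printf '%s\n' "$input" | sed '
$!N
/^\(.*\)\n\1$/!P
D
')

# uniq does the same job on adjacent lines.
reference=$(printf '%s\n' "$input" | uniq)

[ "$expanded" = "$reference" ] && echo "same output as uniq"
```

If the two outputs ever differ on your sed, the handling of N on the last line is the first place I would look.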


Wednesday, December 13, 2006
