Thursday, August 11, 2005

BRE, ERE and PCRE

It's been a while since I had a look at "Mastering Regular Expressions" by Jeff Friedl, but I am glad I went through it once.

So the input is like


So my first attempt at extracting the URL was

$ cat output | sed 's/^.*filename=//'

Then someone pointed out that the URL itself could contain 'filename=' (yeah, that's quite possible) and proposed a solution with back-references, since sed doesn't have the non-greedy quantifiers that are common in PCRE (Perl Compatible Regular Expressions). Sed only supports BRE (basic regular expressions) and ERE (extended regular expressions). Don't ask me what they are :).

$ cat output | sed -r 's/^[[:space:]]*filename="([^"]*)"[[:space:]]*$/\1/g'

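In addition to making space for "spaces" :), this anchors 'filename="' at the start of the line and moves the greedy "*" to the right-hand side of "filename=", so that any extra "filename=" inside the URL is matched by the greedy "[^"]*" in the capture group and comes back through \1.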
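To make the difference concrete, here is a small sketch that runs both sed commands on one line. The sample line is hypothetical (the real input is not shown in this post), and I use -E, which GNU sed treats the same as -r.

```shell
# Hypothetical input line; the real sample from the post is not shown.
line='filename="http://example.com/get?filename=a.txt"'

# First attempt: the greedy .* runs up to the LAST "filename=",
# so most of the URL is thrown away.
greedy=$(printf '%s\n' "$line" | sed 's/^.*filename=//')

# Anchored version: filename=" must start the line (after optional
# whitespace) and the group captures everything up to the closing quote.
anchored=$(printf '%s\n' "$line" | sed -E 's/^[[:space:]]*filename="([^"]*)"[[:space:]]*$/\1/')

printf 'greedy:   %s\n' "$greedy"
printf 'anchored: %s\n' "$anchored"
```

On this sample the first command leaves only the tail after the second "filename=", while the anchored one returns the whole URL without the quotes.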

The same thing in Perl could be done simply with the non-greedy "*?":

$ cat output | perl -wpe 's/^.*?filename=//;'

