RegEx: Exclude lines containing a specific word

As we all know, regular expressions are really powerful. I'm sure every developer needs them at one time or another and is truly happy that they exist. But today they nearly drove me mad.

Half of the time, tools like The Regulator and Expresso do the job and you can almost click your needed expression together, but not today. I needed a regex that would exclude a line if it contained a specific word. So in the following example I want all the lines to match, except the line that contains the word 'ignored':

I want this line.
Give me this line also.
This line must be ignored.
Another line that i want.

After trying various expressions I found that the following regex (thanks to this site) did the trick:

^((?!ignored).)*$

In the regular expression world, '(?!regex)' is called a zero-width negative lookahead: it succeeds at a given position only if regex cannot be matched starting at that position, and it consumes no characters. The rest of the expression, ^( and .)*$, anchors the match to a whole line and repeats the lookahead before every single character, so the line matches only if the word 'ignored' does not start at any position in it.
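
To see the expression at work, here is a minimal sketch in Python (my choice for illustration; the post itself does not name a language). It assumes the four sample lines above are held in one multi-line string and that the engine's multiline mode is on, so that ^ and $ apply to each line.

```python
import re

# The four sample lines from the post, as one multi-line string.
text = """I want this line.
Give me this line also.
This line must be ignored.
Another line that i want."""

# re.MULTILINE makes ^ and $ match at the start and end of every line.
pattern = re.compile(r"^((?!ignored).)*$", re.MULTILINE)

# Collect every line the expression matches in full.
kept = [m.group(0) for m in pattern.finditer(text)]

for line in kept:
    print(line)
```

The third line is skipped because the lookahead fails right before the word 'ignored', so the repetition can never reach the end of that line.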

Published Wed, Dec 3 2008 3:26 PM by Arjen Bloemsma
Comments

# re: RegEx: Exclude lines containing a specific word

Wednesday, December 10, 2008 12:38 AM by Mischa Kroon

Good one :)

# re: RegEx: Exclude lines containing a specific word

Wednesday, December 31, 2008 5:05 PM by Jack Ashburn

Fantastic! You've got it! Like you, I've needed this every now and then, and this is one of those times.

Just 1 minor improvement:

^(?:(?!ignored).)*$

(?: specifies a non-capturing group, so should be slightly more efficient.

Cheers!
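
A small sketch of the difference between the two forms, again in Python as an assumption. It only shows what each group records; it does not measure speed.

```python
import re

line = "I want this line."

capturing = re.compile(r"^((?!ignored).)*$")
non_capturing = re.compile(r"^(?:(?!ignored).)*$")

m1 = capturing.match(line)
m2 = non_capturing.match(line)

# Both forms accept exactly the same text.
same_match = m1.group(0) == m2.group(0)

# The capturing form keeps one group, holding the last character it consumed.
# The non-capturing form keeps no groups at all.
print(same_match, capturing.groups, non_capturing.groups, repr(m1.group(1)))
```

Since nothing ever reads that captured character, dropping the capture loses nothing.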

# re: RegEx: Exclude lines containing a specific word

Tuesday, January 20, 2009 3:44 PM by Arjen Bloemsma

@ Jack Ashburn: Thanks for the improvement :)

# re: RegEx: Exclude lines containing a specific word

Tuesday, June 16, 2009 10:01 PM by Mark

Thanks, I used this to escape bare ampersands (&) in some invalid XML in a legacy system:

Regex regex = new Regex(@"&(?!(amp|lt|quot|gt|apos);)");

string rawxml = regex.Replace(methods.Xml, "&amp;");

This avoids replacing ampersands that already belong to an existing escape sequence.
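
The same idea as a rough Python sketch, assuming the intended replacement text is the escaped form &amp;. The function name and the sample string are made up for illustration.

```python
import re

# "&" that is NOT the start of one of the five predefined XML entities.
BARE_AMP = re.compile(r"&(?!(?:amp|lt|quot|gt|apos);)")

def escape_bare_ampersands(raw_xml: str) -> str:
    """Escape stray ampersands and leave existing entities untouched."""
    return BARE_AMP.sub("&amp;", raw_xml)

# Hypothetical input mixing a bare ampersand with valid entities.
sample = "<name>Tom & Jerry &amp; friends &lt;3</name>"
fixed = escape_bare_ampersands(sample)
print(fixed)
```

Note that numeric character references such as &#38; are not in the list, so this pattern would escape their ampersand as well.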

# re: RegEx: Exclude lines containing a specific word

Thursday, July 23, 2009 5:03 PM by Richard

Hi All,

I'm not as successful and I know I'm doing something wrong!

My example string contains the following:

set obj = server.createobject

set cobj= server.createobject

set objvar =server.createobject

I want to exclude the word "obj" while making sure the match doesn't return the entire line.

I am using the following Regex to try and return only 2 of them:

(?i-:(?!obj)set(?:\x20*|\x09*)\w*(?:\x20*|\x09*)=(?:\x20*|\x09*)server\.createobject)

But it returns all 3 and I am not sure why?

Any help would be much appreciated.
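
One possible explanation, offered as a sketch and not a definitive answer: the lookahead (?!obj) is tested at the position where "set" begins, and the text at that position never starts with "obj", so the check always passes. The Python example below assumes the goal is to skip only the variable named exactly "obj". It moves the lookahead to just before the variable name and adds \b. The names original and moved are illustrative, and re.IGNORECASE stands in for the inline (?i-: option.

```python
import re

text = """set obj = server.createobject
set cobj= server.createobject
set objvar =server.createobject"""

# The expression from the comment: the lookahead sits in front of "set".
original = re.compile(
    r"(?!obj)set(?:\x20*|\x09*)\w*(?:\x20*|\x09*)=(?:\x20*|\x09*)server\.createobject",
    re.IGNORECASE,
)

# Variant: the lookahead is checked where the variable name starts.
moved = re.compile(
    r"set[\x20\x09]+(?!obj\b)\w+[\x20\x09]*=[\x20\x09]*server\.createobject",
    re.IGNORECASE,
)

original_hits = original.findall(text)
moved_hits = moved.findall(text)

print(len(original_hits))
for hit in moved_hits:
    print(hit)
```

With the lookahead in its new position, the line that assigns to "obj" is rejected, while "cobj" and "objvar" still match.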

# re: RegEx: Exclude lines containing a specific word

Tuesday, February 22, 2011 12:59 AM by Kathleen

This seems to work if there are no other words made up of the combination of letters in ignored.

I am trying to EXCLUDE a specific number, say 6014. The source data has 4015, 6041, 0145, etc. I just want to exclude 6014, but I can't figure out how to do it!

# re: RegEx: Exclude lines containing a specific word

Thursday, June 16, 2011 11:04 AM by ObsidianSaint

@kathleen

(?!6014\b)\b\w+

That should do the job
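
A quick check of that expression in Python, as a sketch. The sample string is an assumption built from the numbers Kathleen mentions, plus a made-up 60145 to show that longer numbers starting with 6014 are kept.

```python
import re

# Hypothetical data: the numbers from the question plus 6014 itself and 60145.
data = "4015 6041 6014 0145 60145"

# Match whole words, except the word that is exactly 6014.
pattern = re.compile(r"(?!6014\b)\b\w+")

kept = pattern.findall(data)
print(kept)
```

The lookahead compares the characters in order, so numbers that merely contain the same digits, such as 6041 or 4015, are not affected.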
