How to negate a regex for pandas replace

In pandas I can search and replace all fields that contain the word fish, for example, using df.replace(r'.*fish.*', 'foo', regex = True).

But how do I search and replace all fields that don't contain the word fish?

That is, in my example, replace every field that doesn't contain the word fish with the word 'foo'.

For example, say the dataframe is

applefish pear
water     afishfarm

I would like this to be transformed to

applefish foo
foo       afishfarm 


Solution 1:[1]

You can use a negative lookahead assertion, (?!...). The pattern ^(?!.*fish).*$ first asserts that the string doesn't contain the word fish, then matches everything up to the end of the string so it can be replaced with foo:

  • ^ denotes the beginning of the string (BOS); combined with (?!.*fish), it asserts at BOS that nothing matching .*fish follows, i.e. that fish appears nowhere in the string;
  • If the assertion succeeds, .*$ matches everything up to the end of the string and it is replaced with foo; if the assertion fails, the pattern doesn't match and the value is left unchanged;

so:

df.replace(r'^(?!.*fish).*$', 'foo', regex=True)
#           0           1
#0  applefish         foo
#1        foo   afishfarm
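
For reference, here is a self-contained sketch of the same call. The frame is built from the question's sample data, and the variable names pattern and result are only illustrative:

```python
import pandas as pd

# Sample frame from the question (default integer column labels 0 and 1)
df = pd.DataFrame([["applefish", "pear"], ["water", "afishfarm"]])

# ^(?!.*fish) fails at the start of any cell that contains "fish";
# otherwise .*$ consumes the whole cell, which is replaced by "foo"
pattern = r"^(?!.*fish).*$"
result = df.replace(pattern, "foo", regex=True)

print(result)
```

Note that df.replace returns a new frame here; df itself is left unchanged unless you assign the result back.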

If the string can contain multiple words:

df
#                0          1
#0  applefish pear       pear
#1           water  afishfarm

You can use the word boundary \b in place of ^ and word characters \w+ in place of .*$, so that individual words are matched rather than the whole string:

df.replace(r'\b(?!.*fish)\w+', 'foo', regex=True)
#               0           1
#0  applefish foo         foo
#1            foo   afishfarm
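
One caveat with the per-word pattern: (?!.*fish) looks ahead over the rest of the string, not just the current word, so a word that comes before a fish word is not replaced. The sketch below uses plain re.sub on a made-up string, "pear applefish", to show the difference. The variant \b(?!\w*fish)\w+ is a suggested adjustment, not part of the original answer:

```python
import re

text = "pear applefish"  # hypothetical cell with the fish word last

# Original pattern: the lookahead scans to the end of the string,
# so "pear" is skipped because "fish" appears later on
original = re.sub(r"\b(?!.*fish)\w+", "foo", text)

# Variant: \w* keeps the lookahead inside the current word
per_word = re.sub(r"\b(?!\w*fish)\w+", "foo", text)

print(original)  # pear applefish
print(per_word)  # foo applefish
```

If words before a fish word should also be replaced, the second pattern can be passed to df.replace(..., regex=True) in the same way.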

Solution 2:[2]

You can use apply with str.contains:

df.apply(lambda x: x.replace(x[~x.str.contains('fish')], 'foo'))

You get

    0           1
0   applefish   foo
1   foo         afishfarm

Note: I wouldn't recommend this, as Psidom's solution is far more efficient.
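
A related way to express the same idea without a regex replace is to build a boolean mask with str.contains and pass it to DataFrame.where. This is a sketch of an alternative, not the answer's own code; the name keep is illustrative, and na=False is an assumption about how missing cells should be treated:

```python
import pandas as pd

df = pd.DataFrame([["applefish", "pear"], ["water", "afishfarm"]])

# True where a cell contains "fish"; na=False counts missing cells as
# "does not contain fish", so they would be replaced as well
keep = df.apply(lambda col: col.str.contains("fish", na=False))

# where() keeps cells where the mask is True and fills the rest with "foo"
result = df.where(keep, "foo")

print(result)
```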

Solution 3:[3]

This may help someone with a similar problem: to filter a DataFrame with a negated regex, use it this way:

new_DF = df.loc[~df['columnName'].str.match(r'your regex here')]

If the column contains None values, don't forget to pass na:

... match(r'your regex here', na=True)

Otherwise the boolean mask contains missing values and you get an error.
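
A minimal sketch of this filter, assuming a made-up column with one missing value and the pattern .*fish standing in for 'your regex here'. It also shows how the choice of na decides whether rows with missing values survive the negation:

```python
import pandas as pd

# Hypothetical data: one missing value in the column being filtered
df = pd.DataFrame({"columnName": ["applefish", "pear", None, "fishfarm"]})

# na=True counts missing values as matches, so ~ drops those rows
without_fish = df.loc[~df["columnName"].str.match(r".*fish", na=True)]

# na=False counts them as non-matches, so ~ keeps those rows
with_missing = df.loc[~df["columnName"].str.match(r".*fish", na=False)]

print(without_fish)
print(with_missing)
```

Keep in mind that str.match anchors the pattern at the start of each string; str.contains is the usual choice when the pattern may occur anywhere.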

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Vaishali
Solution 3 pedrojose_moragallegos