'SQL formula to match all words in any order in just one statement

I would like to have a SQL formula to match all the words in a query in any order without using multiple 'AND like' statements.

For example, the query 'cat dog' should match the following statements: 'cat and dog in a park' 'dog and cat are playing'

I found a solution in Regex:

WHERE query REGEXP concat('\'(?=.*',replace('cat dog',' ',')(?=.*'),')\'')  

Note: the part after REGEXP transforms into '(?=.*cat)(?=.*dog)'

However, I get the error 'repetition-operator operand invalid from regexp'.

Could you please help to find another way to get this to work?

The query is a free field (search box) so there can be many words to match. This is why I'm not using:

WHERE query like '%cat%' AND query like '%dog%'

Thanks a lot in advance!

Kevin



Solution 1:[1]

Well, up to and including simple plural (formed by adding just an 's') could be catered for by:

SELECT
  query
FROM T
WHERE
  CONCAT(',', REPLACE(query, ' ', ','), ',')
  REGEXP CONCAT(',', REPLACE('cat dog', ' ', 's?,|,'), 's?,')
;

See it in action: SQL Fiddle.
This, however, does just check, whether at least one of the search terms is found.

If every single search term (still including simple plural) needs to be found at least once, one could try along

SELECT
  query
FROM T
WHERE
  CASE (LENGTH(@search) - LENGTH(REPLACE(@search, ' ', '')))
  WHEN 0
  THEN
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP CONCAT(',', @search, 's?,'))
  WHEN 1
  THEN
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(@search, ' ', 1), 's?,'))
    AND
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(@search, ' ', -1), 's?,'))
  WHEN 2
  THEN
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(@search, ' ', 1), 's?,'))
    AND
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(SUBSTRING_INDEX(@search, ' ', 2), ' ', -1), 's?,'))
    AND
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(@search, ' ', -1), 's?,'))
  WHEN 3
  THEN
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(@search, ' ', 1), 's?,'))
    AND
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(SUBSTRING_INDEX(@search, ' ', 2), ' ', -1), 's?,'))
    AND
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(SUBSTRING_INDEX(@search, ' ', 3), ' ', -1), 's?,'))
    AND
    (CONCAT(',', REPLACE(query, ' ', ','), ',') REGEXP
     CONCAT(',', SUBSTRING_INDEX(@search, ' ', -1), 's?,'))
  ELSE FALSE
  END
;

SQL Fiddle

This might be feasible for a limited number of search terms. Not sure, it is really advisable / preferable compared to generating the equivalent LIKE, INSTR or even REGEXP.

Please comment, if and as adjustment / further detail is required.

Solution 2:[2]

I think this is you want.

 Hitesh> select * from test;
+-------------------------+
| name                    |
+-------------------------+
| i am the boss           |
| You will get soon       |
| Happy birthday bro      |
| the beautiful girl      |
| oyee its sunday         |
| cat and dog in a park   |
| dog and cat are playing |
| cat                     |
| dog                     |
+-------------------------+
9 rows in set (0.00 sec)

Hitesh> set @a='cat';
Query OK, 0 rows affected (0.00 sec)

Hitesh> set @b='dog';
Query OK, 0 rows affected (0.00 sec)

Hitesh> set @var=concat(concat('((^(',@a,') .*)|(.* (',@a,') .*))'),'.*',concat('(.* (',@b,') .*)|(','.* (',@b,')$)'));
Query OK, 0 rows affected (0.00 sec)

Hitesh> set @var2=concat(concat('((^(',@b,') .*)|(.* (',@b,') .*))'),'.*',concat('((.* (',@a,') .*)|(','.* (',@a,')$))'));
Query OK, 0 rows affected (0.00 sec)

Hitesh> select @a, @b, @var, @var2;
+------+------+--------------------------------------------------------+----------------------------------------------------------+
| @a   | @b   | @var                                                   | @var2                                                    |
+------+------+--------------------------------------------------------+----------------------------------------------------------+
| cat  | dog  | ((^(cat) .*)|(.* (cat) .*)).*(.* (dog) .*)|(.* (dog)$) | ((^(dog) .*)|(.* (dog) .*)).*((.* (cat) .*)|(.* (cat)$)) |
+------+------+--------------------------------------------------------+----------------------------------------------------------+
1 row in set (0.00 sec)

Hitesh> select * from test where (name REGEXP @var) or (name REGEXP @var2);
+-------------------------+
| name                    |
+-------------------------+
| cat and dog in a park   |
| dog and cat are playing |
+-------------------------+
2 rows in set (0.00 sec)

Solution 3:[3]

You can only get away with

LIKE "%one%two%"

if one is always before two.

if not, you really do need to use AND LIKE, because mysql regexp doesn't support positive lookahead, which is what you need to do to write a regex that does what you want.

sorry to give you the bad news.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 Mr. Beeblebrox