PHP - Check if string contains words longer than 4 characters, then include "+ *", and for those shorter than 4 characters include only "*"

I managed to get one part partially working, but I cannot make the second part work.

  • If a word has < 4 characters, only * should be added at the end of that word.
  • If a word has >= 4 characters, * should be added at the end and + at the beginning.

The code I wrote...

$string = "This is a short sentence which should include all regex results";

preg_match_all('/\b[A-Za-z0-9]{4,99}\b/', $string, $result);

echo implode("* +", $result[0]);

will produce the following results...

This* +short* +sentence* +which* +should* +include* +regex* +results

while it should return the following results...

+This* is* a* +short* +sentence* +which* +should* +include* all* +regex* +results*

PS: I want this to improve the flexibility of full-text search on InnoDB tables.



Solution 1:[1]

You can use preg_replace with two patterns and two matching replacements: one pattern matches words of 1-3 word characters (\w covers letters, digits, and underscore), and the other matches words of 4 or more:

$string = "This is a short sentence which should include all regex results";
echo preg_replace(array('/\b(\w{1,3})\b/', '/\b(\w{4,})\b/'), array('$1*', '+$1*'), $string);

Output:

+This* is* a* +short* +sentence* +which* +should* +include* all* +regex* +results*

Demo on 3v4l.org
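
When preg_replace is given arrays, the patterns are applied one after another, each to the result of the previous one. The sketch below splits the call above into two separate steps (the variable names $step1 and $step2 are only for illustration) so the intermediate string is visible. The second pattern does not touch the short words again because they still have fewer than four word characters; the appended * is not a word character.

```php
<?php
$string = "This is a short sentence which should include all regex results";

// Step 1: words of 1-3 word characters get a trailing asterisk.
$step1 = preg_replace('/\b(\w{1,3})\b/', '$1*', $string);
echo $step1, PHP_EOL;

// Step 2: words of 4 or more word characters get a leading plus
// and a trailing asterisk. Short words such as "is*" are skipped
// because "is" is still only two word characters long.
$step2 = preg_replace('/\b(\w{4,})\b/', '+$1*', $step1);
echo $step2, PHP_EOL;
```

The second line printed should be identical to the output of the single array-based call.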

Solution 2:[2]

This regex task can be done in a single pass over the input string.

Code: (Demo)

$string = "This is a short sentence which should include all regex results";
echo preg_replace_callback(
         '~(\w{3})?(\w+)~',
         fn($m) => ($m[1] ? "+" : '') . "$m[0]*",
         $string
     );

Output:

+This* is* a* +short* +sentence* +which* +should* +include* all* +regex* +results*

The pattern optionally matches the first three word characters of each "word" -- purely to determine whether a plus symbol should be prepended to the replacement. For a word of one to three characters the optional group cannot match, because the second group must still consume at least one character, so the first group comes back as an empty string. The second capture group is not used in the replacement, but it is declared to ensure that the first capture group always exists (so that iterated isset() calls are avoided). Then simply use the full string match and append an asterisk to complete the replacement string.
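
To make the role of the two capture groups concrete, here is a minimal sketch that prints what each group holds for words of different lengths and wraps the same pattern in a small function. The function name addSearchOperators is hypothetical and not part of the original answer; the explicit `!== ''` comparison is a stylistic variation on the truthiness check above. Arrow functions require PHP 7.4 or later.

```php
<?php
// Hypothetical helper: same single-pass pattern as in the answer above.
function addSearchOperators(string $text): string
{
    return preg_replace_callback(
        '~(\w{3})?(\w+)~',
        // $m[1] is '' for words of 1-3 characters, so no plus is added.
        fn(array $m): string => ($m[1] !== '' ? '+' : '') . $m[0] . '*',
        $text
    );
}

// Show what each capture group holds for words of different lengths.
foreach (['a', 'all', 'this', 'sentence'] as $word) {
    preg_match('~(\w{3})?(\w+)~', $word, $m);
    printf(
        "%-8s group1=%-6s group2=%s\n",
        $word,
        var_export($m[1], true),
        var_export($m[2], true)
    );
}

echo addSearchOperators('all regex results'), PHP_EOL; // all* +regex* +results*
```

For "all", group 1 is empty and group 2 holds the whole word; for "this", group 1 holds "thi" and group 2 holds "s".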

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Nick
Solution 2    mickmackusa