How to match some words after matching an entire sentence using regex?
I'm a newbie. I'm trying to extract the full name from either of the lines below, without the "Obituary for" prefix:
<h2>Obituary for John Doe</h2>
<h1>James Michael Lee</h1>
My regex is:
(<h1>(.+?)<\/h1>|<h2>Obituary\sfor\s(.+?)<\/h2>)
What I'm getting is still "Obituary for John Doe". How do I remove the "Obituary for" part?
Solution 1:[1]
Many roads lead to Rome; you can probably do something like this:
<h(?:1>|2>Obituary\sfor\s)\K[^><]+
See this demo at regex101. The matches will be in $out[0]. \K resets the beginning of the reported match, so whatever was matched before it is left out of the result. See the SO Regex FAQ for more.
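As a minimal sketch of how that pattern might be run from PHP: the sample input is the one from the question, `$out` follows the name used above, and the `~` delimiters are an assumption chosen so the pattern needs no extra escaping.

```php
<?php
// Sample input from the question.
$html = "<h2>Obituary for John Doe</h2>\n<h1>James Michael Lee</h1>";

// \K drops everything matched so far from the reported match,
// so only the text after the tag (and the optional prefix) is kept.
$pattern = '~<h(?:1>|2>Obituary\sfor\s)\K[^><]+~';

preg_match_all($pattern, $html, $out);

// $out[0] holds the full matches, which here are just the names.
print_r($out[0]);
```

Note that an `<h2>` without the "Obituary for " prefix is not matched at all by this pattern, which may or may not be what you want.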
Solution 2:[2]
Could you do something like this, with a simple pattern to pull out the heading contents and a small helper to strip the prefix?
/**
 * @description : Function extracts names from html header tags
 * @example : "<h2>Obituary for John Doe</h2><h1>James Michael Lee</h1>" -> ["John Doe", "James Michael Lee"]
 * @param string $html
 * @return string[] : list of full names
 */
function extractFullNames($html) {
    // Capture the contents of each <h1>/<h2>; \1 forces the closing tag to match the opening one.
    $regex = '/<h([12])>(.*?)<\/h\1>/';
    preg_match_all($regex, $html, $matches);
    $names = $matches[2];
    $names = array_map('strip_tags', $names);
    $names = array_map('trim', $names);
    // Case is left untouched so names such as "McDonald" are not altered.
    $names = array_map('removeObituary', $names);
    return $names;
}
/**
 * @description : Function used to remove a leading "Obituary for" if present
 * @example : "Obituary for John Doe" -> "John Doe"
 * @param string $name
 * @return string : name without "Obituary for"
 */
function removeObituary($name) {
    // Anchored and case-insensitive, so only a prefix is removed.
    return preg_replace('/^Obituary\s+for\s+/i', '', $name);
}
// Test cases
$html = '<h2>Obituary for John Doe</h2><h1>James Michael Lee</h1>';
$names = extractFullNames($html);
$expected = ['John Doe', 'James Michael Lee'];
echo "Expected: " . implode(', ', $expected) . "\n";
echo "Actual: " . implode(', ', $names) . "\n";
Solution 3:[3]
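I'd probably do something like
/^(?:\s*<[^>]*?>)?(?:.*\s+for\s+)?([^<]*)/
and extract $1 (the first match group). The \s* lets the optional opening tag sit at the very start of the line; with a single mandatory \s the tag would be skipped and $1 would come back empty for the <h1> line.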
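A rough sketch of applying this pattern line by line in PHP. `extractName` is a hypothetical helper name, and the pattern is written with `\s*` before the tag, which is an assumption so that a heading starting at column zero is still consumed.

```php
<?php
// Hypothetical helper: returns the name found in a single heading line.
function extractName(string $line): string
{
    // Optional leading tag, optional "... for " prefix, then capture up to the next "<".
    // \s* (rather than a single \s) is an assumption so the tag may start the line.
    $pattern = '/^(?:\s*<[^>]*?>)?(?:.*\s+for\s+)?([^<]*)/';
    if (preg_match($pattern, $line, $m)) {
        return trim($m[1]);
    }
    return '';
}

echo extractName('<h2>Obituary for John Doe</h2>'), "\n";
echo extractName('<h1>James Michael Lee</h1>'), "\n";
```

Because `.*` is greedy and the prefix test only looks for the word "for", a name that itself contains " for " would be cut at the last occurrence, so this approach suits input where that cannot happen.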
Solution 4:[4]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | PCDSandwichMan |
Solution 3 | chaos |
Solution 4 | Ryszard Czech |