Extract string between combination of words and characters [duplicate]

I would like to keep the string between FROM and as, or between FROM and the end of the line when there is no as.

Input:

FROM some_registry as registry1
FROM another_registry

Output:

some_registry
another_registry

Using the following sed commands, I can extract the strings. Is there a way to combine the two sed commands into one?

sed -e 's/.*FROM \(.*\) as.*/\1/' | sed s/"FROM "//



Solution 1:[1]

Merging these into a single regular expression is hard here because POSIX regular expressions do not support lazy quantifiers.
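
To illustrate the problem, here is a small sketch of a naive merge that simply makes the as part optional. The sample line comes from the question; the variable name naive is only for illustration. Because .* is greedy, it consumes the rest of the line and the optional group matches nothing, so the alias is not removed.

```shell
#!/bin/sh
# Naive attempt: one expression with an optional " as ..." tail.
# The greedy .* takes the whole remainder of the line, so the
# optional group is left empty and nothing is stripped.
naive=$(printf '%s\n' 'FROM some_registry as registry1' |
  sed -E 's/FROM (.*)( as.*)?/\1/')
printf '%s\n' "$naive"   # prints: some_registry as registry1
```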

With GNU sed, you can pass both substitutions in a single invocation, separated by a semicolon:

sed 's/.*FROM \(.*\) as.*/\1/;s/FROM //' file

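
If the name after FROM is known to contain no spaces, a single substitution may be enough. The following is only a sketch under that assumption: a negated bracket expression stands in for the missing lazy quantifier, and the function name extract_from is hypothetical. Note that it matches an upper-case FROM at the start of the line only.

```shell
#!/bin/sh
# Assumption: the name after FROM never contains a space, so [^ ]*
# stops at the first space (or at the end of the line) by itself.
extract_from() {
  # -n plus the p flag prints only the lines where the substitution happened
  sed -n 's/^FROM  *\([^ ]*\).*/\1/p'
}

result=$(printf '%s\n' 'FROM some_registry as registry1' 'FROM another_registry' |
  extract_from)
printf '%s\n' "$result"
```

With the two input lines from the question this prints some_registry and another_registry, one per line.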

However, if you have GNU grep with PCRE support (the -P option), you can use a somewhat more precise expression:

#!/bin/bash
s='FROM some_registry as registry1
From another_registry'
grep -oP '(?i)\bFROM\s+\K.*?(?=\s+as\b|$)' <<< "$s"

Details:

  • (?i) - case insensitive matching ON
  • \b - a word boundary
  • FROM - a word
  • \s+ - one or more whitespaces
  • \K - "forget" all text matched so far
  • .*? - any zero or more chars other than line break chars, as few as possible
  • (?=\s+as\b|$) - a positive lookahead that matches a location immediately followed by one or more whitespaces and then the whole word as, or by the end of the string (here, the end of the line).
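
As a usage sketch, the pattern explained above can be wrapped in a small function and run over mixed input. The function name list_base_images and the sample lines are made up for illustration, and GNU grep with -P is assumed to be available. Lines without FROM produce no output, and the (?i) flag lets both from and AS match regardless of case.

```shell
#!/bin/bash
# Hypothetical helper around the grep pattern from the answer.
# Reads the given files, or standard input when no file is passed.
list_base_images() {
  grep -oP '(?i)\bFROM\s+\K.*?(?=\s+as\b|$)' "$@"
}

# Made-up sample input: three FROM lines in different cases and one other line.
images=$(list_base_images <<'EOF'
FROM some_registry as registry1
RUN echo hello
from another_registry
FROM third_registry AS final
EOF
)
printf '%s\n' "$images"
```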

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Wiktor Stribiżew