How to grab text after a newline in a text file that is not clean of spaces and tabs [closed]
Assume this: the solution needs to take a file name as an argument.
This is the only text I'm showing; the real file has more data (not shown). The problem: the text is only semi-clean. It is full of whitespace, tabs, and Unicode characters, and it has to stay that way for my needs, so copying and pasting the exact text below won't reproduce it (the markup reformats it):
I have some text like this:
*** *
more text with spaces and tabs
*****
1
Something here and else, 2000 edf, 60 pop
Usd324.32 2 Usd534.22
2
21st New tetx that will like to select with pattern, 334 pop
Usd162.14
*** *
more text with spaces and tabs, unicode
*****
I'm trying to grab exactly this text:
1 Something here and else, 2000 edf, 60 pop Usd324.32
Because of the newline and whitespace, the next command only grabs 1:
grep -E '1\s.+'
I have also tried combining patterns with alternation:
grep -E '1\s|[A-Z].+'
But it doesn't work: grep starts selecting similar patterns in other parts of the text. I have already applied these cleanup steps:
awk '{$1=$1}1' #done already
tr -s "\t\r\n\v" #done already
tr -d "\t\b\r" #done already
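For context, grep tests its pattern against one line at a time, so a pattern written this way cannot reach across the newline after 1. The following is a minimal sketch, assuming GNU or BSD grep (for -A) and a hypothetical sample file rebuilt from the text above; the function name marker_with_context is made up for illustration. It isolates the marker line together with the two lines that follow it:

```shell
#!/bin/bash
# Hypothetical sample rebuilt from the question's text (the real file is not shown).
sample=$(mktemp)
printf '%s\n' '*****' ' 1' 'Something here and else, 2000 edf, 60 pop' \
    'Usd324.32 2 Usd534.22' '2' '21st New text, 334 pop' 'Usd162.14' > "$sample"

# -x  : the whole line must match, so "21st ..." is not selected
# -A2 : also print the two lines after each match
# The pattern tolerates stray whitespace around the 1.
marker_with_context() {
    grep -E -x -A2 '[[:space:]]*1[[:space:]]*' "$1"
}

marker_with_context "$sample"
```

This only isolates the three lines; joining them and removing Usd still needs a tool that can work across lines, such as sed or awk.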
How can I:
- grab the 1 that sits on a line by itself
- grab the whole line that follows that newline
- grab the amount Usd324.32 and remove the Usd prefix
Solution 1:[1]
You can use this sed:
sed -En '/^1/ {N;N;s/[[:blank:]]*Usd([^[:blank:]]+)[^\n]*$/\1/; s/\n/ /gp;}' file
1 Something here and else, 2000 edf, 60 pop 324.32
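The one-liner above is dense, so here is the same command spread over several lines with comments. This is a sketch assuming GNU sed (where \n inside a bracket expression means a newline) and a hypothetical sample file rebuilt from the question; the wrapper name join_record is made up for illustration:

```shell
#!/bin/bash
# Hypothetical sample rebuilt from the question's text.
sample=$(mktemp)
printf '%s\n' '*****' '1' 'Something here and else, 2000 edf, 60 pop' \
    'Usd324.32 2 Usd534.22' '2' > "$sample"

join_record() {
    sed -En '
# Act only on a line that starts with 1.
/^1/ {
    # Append the next two input lines to the pattern space.
    N
    N
    # Keep what follows the first Usd and drop the rest of that line.
    s/[[:blank:]]*Usd([^[:blank:]]+)[^\n]*$/\1/
    # Turn the embedded newlines into spaces and print the result.
    s/\n/ /gp
}' "$1"
}

join_record "$sample"
```

Note that /^1/ also matches lines such as 10 or 1st; if the real file contains those, anchoring the address as /^1$/ may be the stricter choice.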
Or this awk would also work:
awk '$0 == 1 {
    printf "%s", $0
    getline
    printf " %s ", $0
    getline
    sub(/Usd/, "")
    print $1
}' file
1 Something here and else, 2000 edf, 60 pop 324.32
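Since the question says the file is full of stray spaces and tabs, a variant that tracks state with a counter instead of calling getline may be easier to adapt. This is only a sketch under assumptions: the function name extract_record, the variable names, and the tab-laden sample are hypothetical, and carriage returns are assumed to have been removed already:

```shell
#!/bin/bash
# Hypothetical sample with tabs and extra spaces around the fields.
sample=$(mktemp)
printf '*****\n\t1\t\n  Something here\tand else, 2000 edf, 60 pop\nUsd324.32 2 Usd534.22\n2\n' > "$sample"

extract_record() {
    awk '
        # First line after the marker: squeeze whitespace and keep it whole.
        want == 2 { $1 = $1; out = out " " $0; want = 1; next }
        # Second line after the marker: keep the first field without Usd.
        want == 1 { sub(/^Usd/, "", $1); print out, $1; want = 0; next }
        # A line whose only field is 1 starts a record.
        NF == 1 && $1 == "1" { out = $1; want = 2 }
    ' "$1"
}

extract_record "$sample"
```

Because awk splits fields on runs of blanks by default, leading and trailing tabs around the 1 do not prevent the match, and assigning $1 = $1 rebuilds the line with single spaces.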
Solution 2:[2]
Pure Bash:
#! /bin/bash
exec <<EOF
*** *
more text with spaces and tabs
*****
1
Something here and else, 2000 edf, 60 pop
Usd324.32 2 Usd534.22
2
21st New tetx that will like to select with pattern, 334 pop
Usd162.14
*** *
more text with spaces and tabs, unicode
*****
EOF
while read -r line1; do
    if [[ $line1 =~ ^1$ ]]; then
        read -r line2
        read -r line3col1 dontcare
        printf '%s %s %s\n' "$line1" "$line2" "${line3col1#Usd}"
    fi
done
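The question asks for the file name to be passed as an argument, while the script above reads from a here-document. A minimal sketch of the same loop reading from a named file follows; the function name grab_record and the temporary sample file are hypothetical:

```shell
#!/bin/bash
# Same loop as above, but reading from the file named in the first argument.
grab_record() {
    local file=$1 line1 line2 line3col1 dontcare
    while read -r line1; do
        if [[ $line1 =~ ^1$ ]]; then
            read -r line2                 # whole line after the marker
            read -r line3col1 dontcare    # first word of the third line
            printf '%s %s %s\n' "$line1" "$line2" "${line3col1#Usd}"
        fi
    done < "$file"
}

# Hypothetical sample rebuilt from the question's text.
sample=$(mktemp)
printf '%s\n' '*****' '1' 'Something here and else, 2000 edf, 60 pop' \
    'Usd324.32 2 Usd534.22' '2' > "$sample"

grab_record "$sample"
```

In a standalone script, the last line would typically be grab_record "$1" so that the file name comes from the command line. With the default IFS, read also trims leading and trailing spaces and tabs, which helps with the semi-clean input.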
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | anubhava |
Solution 2 | ceving |