Removing leading, trailing and multiple spaces within a string
I would like to remove all leading and trailing spaces, and replace multiple spaces within a string with a single space, so that all words in the string are separated by exactly one space.
I can achieve this with the following two regex passes, but I am looking for a single-regex solution.
s/^\s+|\s+$//g
s/\s+/ /g
Sample Input (with leading, trailing and repeated spaces):
"   word1  word2   word3    word4   "
Desired Output:
"word1 word2 word3 word4"
I would appreciate any help with this.
Solution 1:[1]
You can use something like:
s/^\s+|\s+$|\s+(?=\s)//g
\s+(?=\s)
matches every whitespace character in a run except the last one, so exactly one is left between words.
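For comparison, here is a minimal sketch of the same single-pass pattern in JavaScript, whose regex engine supports the lookahead used here. The function name normalizeSpaces is hypothetical and not part of the original answer. Note that the character left in each run is whichever whitespace character came last, so a tab can survive.

```javascript
// Hypothetical helper: one replace call, three alternatives.
function normalizeSpaces(s) {
  // ^\s+       leading whitespace
  // \s+$       trailing whitespace
  // \s+(?=\s)  every whitespace character in a run except the last
  return s.replace(/^\s+|\s+$|\s+(?=\s)/g, '');
}

// JSON.stringify makes any remaining whitespace visible in the output.
console.log(JSON.stringify(normalizeSpaces('   word1  word2   word3    word4  ')));
// prints "word1 word2 word3 word4"
```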
Solution 2:[2]
In JavaScript, the string prototype has two methods that can handle this:
str = str.trim().replace(/\s+/g, ' ')
str.trim(): removes leading and trailing whitespace and returns a new string without modifying the original.
str.replace(regex, replacement): matches regex against the string, replaces the matches with replacement, and returns the result as a new string.
In my example, the regex is delimited with slashes (/regex/) and followed by the g flag, which indicates that every match should be replaced. Without the g flag, only the first match is replaced.
Note: The first argument of .replace() must not be wrapped in quotes if you want it to be interpreted as a regular expression.
\s+ matches one or more whitespace characters in a row.
Example:
const singleSpace = (sloppyStr) => {
  // Trim both ends, then collapse each run of whitespace to one space
  const cleanStr = sloppyStr.trim().replace(/\s+/g, ' ');
  console.log(cleanStr);
};
singleSpace('   1  2   3    4  ');
yields: '1 2 3 4'
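The points above about the g flag and about quoting the first argument can be checked with a small sketch. The variable names are illustrative only.

```javascript
const input = 'a  b  c';

// Without the g flag, only the first run of whitespace is replaced.
const firstOnly = input.replace(/\s+/, ' ');

// With the g flag, every run is replaced.
const all = input.replace(/\s+/g, ' ');

// A quoted first argument is treated as a literal string, not a regex.
// The literal text \s+ does not occur in the input, so nothing changes.
const literal = input.replace('\\s+', ' ');

console.log(JSON.stringify([firstOnly, all, literal]));
// prints ["a b  c","a b c","a  b  c"]
```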
regex: Kleene operators will help you understand the regex used to match multiple spaces
Learn more:
regex: helpful guide on regex and /g flag
Google: MDN string trim
Google: MDN string replace
Solution 3:[3]
Using awk
echo " word1 word2 word3 word4 " | awk '{$1=$1}1'
word1 word2 word3 word4
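Assigning $1=$1 is a trick that forces awk to rebuild the record from its fields, joined by single spaces. This drops the leading, trailing and repeated whitespace.
You can even shorten it to
awk '$1=$1' file
but this fails if the first field is 0 or 0.0: the value of the assignment is then used as the pattern, it evaluates to false, and the line is not printed.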
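The same field-based idea can be sketched in JavaScript, the language used in Solution 2: split on runs of whitespace, drop the empty fields at the ends, and join with single spaces. The helper name rejoinFields is hypothetical. Because no field value is used as a condition, a first field of 0 causes no problem here.

```javascript
// Hypothetical helper mirroring awk's field rebuild.
function rejoinFields(s) {
  return s
    .split(/\s+/)     // leading/trailing whitespace yields empty strings at the ends
    .filter(Boolean)  // drop empty strings; '0' is a non-empty string and is kept
    .join(' ');       // rejoin the fields with a single space
}

console.log(JSON.stringify(rejoinFields('  word1   word2  word3 word4  ')));
// prints "word1 word2 word3 word4"
```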
Solution 4:[4]
This might work for you (GNU sed):
sed -r 's/((^)\s*(\S))|((\S)\s*($))|(\s)\s*/\2\3\5\6\7/g' file
or simply:
sed -r 's/(^\s*(\S))|((\S)\s*$)|(\s)\s*/\2\4\5/g' file
Solution 5:[5]
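If you are on UNIX, you can take advantage of the shell's word splitting. Because $STR is left unquoted, it also undergoes filename expansion, so this is only safe when the string contains no glob characters such as *. A Bash example using command substitution: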
STR=" word1 word2 word3 word4 "
z=$(echo $STR)
echo "$z"
word1 word2 word3 word4
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Jerry |
Solution 2 | |
Solution 3 | Jotne |
Solution 4 | |
Solution 5 | iruvar |