R function to start new line every n words?

I want to create an R function that inserts a "\n" after every n words in a string (where n is an argument).

e.g.

startstring <- "I like to eat fried potatoes with gravy for dinner."

myfunction(startstring, 4)

would give:

"I like to eat\nfried potatoes with gravy\nfor dinner."

I believe that to do this I need to split the string into several parts, each n words long, and then paste these together with a separator of "\n". However, I do not know how to do the initial splitting step.

Can anyone advise?



Solution 1:[1]

You could solve this with regular expressions, or with this abomination:

words = strsplit(startstring, ' ')[[1L]]
splits = cut(seq_along(words), breaks = seq(0L, length(words) + 4L, by = 4L))
paste(lapply(split(words, splits), paste, collapse = ' '), collapse = '\n')

But a better way for most practical applications is to use strwrap to wrap the text at a given column length, rather than by word count:

paste(strwrap(startstring, 20), collapse = '\n')
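The split-and-paste approach above hard-codes 4 as the group size. A minimal sketch of how it could be wrapped into a function that takes n as an argument follows. The name wrap_words and the use of ceiling() to compute the group index are illustrative choices, not part of the original answer, and the sketch assumes a single input string whose words are separated by spaces.

```r
# Hypothetical helper (name chosen for illustration):
# insert "\n" after every n words of a single string.
wrap_words <- function(x, n) {
  # Split on runs of spaces; assumes x is a length-one character vector
  words <- strsplit(x, " +")[[1L]]
  # Group index for each word: n words share the same index
  groups <- ceiling(seq_along(words) / n)
  # Join the words of each group with a space, then the groups with "\n"
  lines <- vapply(split(words, groups), paste, character(1), collapse = " ")
  paste(lines, collapse = "\n")
}

startstring <- "I like to eat fried potatoes with gravy for dinner."
wrap_words(startstring, 4)
#[1] "I like to eat\nfried potatoes with gravy\nfor dinner."
```

Computing the group index directly avoids building break points with cut, so no empty trailing group appears when the word count is an exact multiple of n.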

Solution 2:[2]

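You can use the code below. Note that the pattern is fixed at four words, and the character class [a-z0-9] only covers lowercase letters and digits, so words containing uppercase letters or punctuation may not be grouped as intended: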

gsub("([a-z0-9]* [a-z0-9]* [a-z0-9]* [a-z0-9]*) ", "\\1\n", startstring)

Solution 3:[3]

You can use gsub to create an R function that inserts a \n after every n words, where n is an argument.

fun <- function(str, n) {gsub(paste0("([^ ]+( +[^ ]+){",n-1,"}) +"),
                              "\\1\n", str)}
fun(startstring, 4)
#[1] "I like to eat\nfried potatoes with gravy\nfor dinner."
fun(startstring, 2)
#[1] "I like\nto eat\nfried potatoes\nwith gravy\nfor dinner."

Here [^ ]+ matches one or more characters that are not a space. ( +[^ ]+){3} matches one or more spaces ( +) followed by one or more non-space characters ([^ ]+), repeated 3 times; the {3} is what {n-1} becomes for n = 4.
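To see what the pattern looks like for a concrete n, it can help to build it outside of gsub and inspect what it matches. The sketch below uses base R's gregexpr and regmatches; the variable names pattern and matches are illustrative.

```r
startstring <- "I like to eat fried potatoes with gravy for dinner."
n <- 4

# Build the same pattern that fun() passes to gsub
pattern <- paste0("([^ ]+( +[^ ]+){", n - 1, "}) +")
pattern
#[1] "([^ ]+( +[^ ]+){3}) +"

# Each match is a run of n words plus the spaces that follow it:
# "I like to eat " and "fried potatoes with gravy "
matches <- regmatches(startstring, gregexpr(pattern, startstring))[[1L]]
matches
```

The last two words are not matched because fewer than n words remain, which is why no "\n" is added at the end of the string.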

Or an alternative using \\K in the pattern instead of \\1 in the replacement (\\K requires perl=TRUE):

fun <- function(str, n) {gsub(paste0("[^ ]+( +[^ ]+){",n-1,"}\\K +"),
                              "\n", str, perl=TRUE)}

or by using strsplit:

fun2 <- function(str, n) {
  paste0(strsplit(str, " +")[[1L]], c(rep(" ",n-1),"\n"), collapse = "")}
fun2(startstring, 4)
#[1] "I like to eat\nfried potatoes with gravy\nfor dinner. "

or without a trailing space or \n at the end:

fun3 <- function(str, n) {
  . <- strsplit(str, " +")[[1L]]
  paste0(., c(rep_len(c(rep(" ",n-1),"\n"), length(.)-1), ""), collapse = "")}
fun3(startstring, 4)
#[1] "I like to eat\nfried potatoes with gravy\nfor dinner."

Or using \\K in strsplit as well, so that the matched words are kept and only the spaces after them are used as the split point:

fun4 <- function(str, n) {paste(strsplit(str,
   paste0("[^ ]+( +[^ ]+){",n-1,"}\\K +"), perl=TRUE)[[1L]], collapse="\n")}
fun4(startstring, 4)
#[1] "I like to eat\nfried potatoes with gravy\nfor dinner."

Solution 4:[4]

This uses spaces to separate words, in base R:

gsub("(\\S* \\S* \\S* \\S*) ", "\\1\n", startstring)
#[1] "I like to eat\nfried potatoes with gravy\nfor dinner."
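The pattern above is fixed at four words. One possible way to parameterise it is sketched here, under the assumption that any run of whitespace (not only a single space) separates words. The pattern is built from n and uses a PCRE non-capturing group, so perl = TRUE is needed; the function name wrap_every is hypothetical.

```r
# Hypothetical helper: replace the whitespace after every n-th word with "\n".
wrap_every <- function(x, n) {
  # (?:\S+\s+){n-1}\S+ captures n whitespace-separated words;
  # the trailing \s+ is the separator that gets replaced.
  pattern <- paste0("((?:\\S+\\s+){", n - 1, "}\\S+)\\s+")
  gsub(pattern, "\\1\n", x, perl = TRUE)
}

startstring <- "I like to eat fried potatoes with gravy for dinner."
wrap_every(startstring, 4)
#[1] "I like to eat\nfried potatoes with gravy\nfor dinner."

# writeLines() prints the result with the line breaks rendered
writeLines(wrap_every(startstring, 4))
```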

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3
Solution 4 Daniel O