Extract string between the tags in Java

I have a string like the one below:

Msg_Begin
Some message1
Msg_End
Msg_Begin
Some message2
Msg_End
Msg_Begin
Some message3
Msg_End

I want to collect the messages between Msg_Begin and Msg_End into a list like this:

[Some message1, Some message2, Some message3]

What is the best approach for this in Java?



Solution 1:[1]

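// Strip the opening tags, split on the closing tag, then trim the leftover line breaks.
var messages = originalString.replace("Msg_Begin", "");
return Arrays.stream(messages.split("Msg_End"))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toList());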

Just make sure that your messages do not contain Msg_Begin or Msg_End.
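For reference, here is a minimal self-contained sketch of the same split-based idea, which also shows why the caveat above matters. The class name SplitExtractor and the method name extract are hypothetical, chosen only for illustration, and the sample input uses \n line breaks as an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SplitExtractor {
    // Hypothetical helper: removes the opening tags, then splits on the closing tag.
    static List<String> extract(String input) {
        // replace() treats the tag as a literal string, so no regex escaping is needed.
        String withoutBegin = input.replace("Msg_Begin", "");
        return Arrays.stream(withoutBegin.split("Msg_End"))
                .map(String::trim)          // drop the line breaks left around each message
                .filter(s -> !s.isEmpty())  // drop pieces that contained only whitespace
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String input = "Msg_Begin\nSome message1\nMsg_End\n"
                     + "Msg_Begin\nSome message2\nMsg_End\n";
        System.out.println(extract(input)); // [Some message1, Some message2]

        // A message that mentions a tag is cut apart, which is the caveat noted above.
        System.out.println(extract("Msg_Begin\nsee Msg_End marker\nMsg_End")); // [see, marker]
    }
}
```

Trimming is a design choice here: it also removes any leading or trailing whitespace that belongs to a message, so it may not suit input where that whitespace is significant.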

Solution 2:[2]

You can achieve that with a regular expression:

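// Requires java.util.ArrayList, java.util.List, java.util.regex.Matcher and java.util.regex.Pattern
// Build the test input from the question and print it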
String entry = "Msg_Begin\r\n" + 
               "Some message1\r\n" + 
               "Msg_End\r\n" + 
               "Msg_Begin\r\n" + 
               "Some message2\r\n" + 
               "Msg_End\r\n" + 
               "Msg_Begin\r\n" + 
               "Some message3\r\n" + 
               "Msg_End";

System.out.println("IN : \r\n" + entry);

// Compile the regular expression pattern; the DOTALL flag lets '.' match line terminators,
// so a message that spans several lines is still captured
Pattern p = Pattern.compile("Msg_Begin\r\n(.+?)\r\nMsg_End(\r\n)?", Pattern.DOTALL);
Matcher m = p.matcher(entry);

// Iterate over the matches (for example, add them to a list)
System.out.println("\r\nOUT :");
List<String> list = new ArrayList<>();
while (m.find()) {
    list.add(m.group(1));
    System.out.println(m.group(1));
}

This produces the following result:

IN : 
Msg_Begin
Some message1
Msg_End
Msg_Begin
Some message2
Msg_End
Msg_Begin
Some message3
Msg_End

OUT :
Some message1
Some message2
Some message3

More information about regular expression syntax can be found here
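The pattern above hard-codes \r\n, so it will not match input that uses \n line breaks. As a hedged variant, the sketch below uses the \R linebreak matcher so that either style is accepted, and collects the matches with Matcher.results(), which assumes Java 9 or later. The class name TagExtractor and the method name extract are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TagExtractor {
    // \R matches any line break (\n, \r\n, \r, ...).
    // DOTALL lets '.' match line terminators, so multi-line messages are captured.
    private static final Pattern MESSAGE =
            Pattern.compile("Msg_Begin\\R(.*?)\\RMsg_End", Pattern.DOTALL);

    // Hypothetical helper: returns the text between each pair of tags.
    static List<String> extract(String input) {
        return MESSAGE.matcher(input)
                .results()               // Java 9+: a stream of MatchResult objects
                .map(r -> r.group(1))    // group 1 is the text between the tags
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Mixed line endings on purpose: \r\n in the first block, \n in the second.
        String input = "Msg_Begin\r\nSome message1\r\nMsg_End\n"
                     + "Msg_Begin\nSome message2\nMsg_End";
        System.out.println(extract(input)); // [Some message1, Some message2]
    }
}
```

The reluctant quantifier (.*?) stops at the first closing tag that follows a line break, so each block is matched separately rather than as one long span.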

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Patrick Roger