Java regex for UUID

I want to parse a String that contains a UUID in the format below:

"&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"

I have tried parsing it in the way below, which works, but I think it is slow:

private static final String re1 = ".*?";
private static final String re2 = "([A-Z0-9]{8}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{12})";
private static final Pattern splitter = Pattern.compile(re1 + re2, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

I am looking for a faster way and tried the following, but it fails to match:

private static final Pattern URN_UUID_PATTERN = Pattern.compile("^< urn:uuid:([^&])+&gt");

I am new to regex. Any help is appreciated.

Aqura



Solution 1:[1]

Your example of a faster regex uses a literal < where the input has &lt;, so that's confusing.

Regarding speed: first, a UUID is hexadecimal, so match a-f rather than A-Z. Second, you give no indication that the case is mixed, so drop CASE_INSENSITIVE and write the range in the correct case.

You don't explain if you need the part preceding the UUID. If not, don't include .*?, and you may as well write the literals for re1 and re2 together in your final Pattern. There's no indication you need DOTALL either.

private static final Pattern splitter =
  Pattern.compile("[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}");
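
As a rough sketch of how that pattern might be used: the class name UuidFindDemo and the helper findUuid are made up for illustration, and the input is assumed to contain the UUID in lowercase. Note that the fourth "-xxxx" group and the trailing 8 hex digits together make up the final 12-digit block of the UUID.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class UuidFindDemo {
    // 8 hex digits, four "-xxxx" groups, then 8 more hex digits (36 characters in total).
    private static final Pattern UUID_PATTERN =
        Pattern.compile("[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}");

    /** Returns the first lowercase UUID found in the input, or null if there is none. */
    static String findUuid(String input) {
        Matcher m = UUID_PATTERN.matcher(input);
        // find() scans forward, so no leading ".*?" is needed.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findUuid("&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"));
    }
}
```

Using find() rather than matches() is what lets you drop the .*? prefix: the matcher skips ahead to the first position where the pattern fits.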

Alternatively, if you have measured your regular expression and found it too slow, you might try another approach. For example, is each UUID preceded by "uuid:" as in your example? If so, you can

  1. find the first index of "uuid:" as i, then
  2. substring 0 to i+5 [assuming you needed it at all], and
  3. substring i+5 to i+41, if I counted that right (36 characters in length).
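
The steps above could be sketched roughly as follows. The class name UuidIndexOfDemo and the method extractUuid are hypothetical, and the sketch assumes the UUID always follows the first "uuid:" marker and is exactly 36 characters long; it does not check that those 36 characters are actually a valid UUID.

```java
// Hypothetical demo class illustrating the indexOf/substring steps.
class UuidIndexOfDemo {
    private static final String MARKER = "uuid:";
    private static final int UUID_LENGTH = 36; // 32 hex digits + 4 hyphens

    /** Returns the 36 characters after the first "uuid:", or null if they are not there. */
    static String extractUuid(String input) {
        int i = input.indexOf(MARKER);          // step 1: locate the marker
        if (i < 0) {
            return null;                        // marker missing
        }
        int start = i + MARKER.length();        // i + 5
        int end = start + UUID_LENGTH;          // i + 41
        if (end > input.length()) {
            return null;                        // input too short to hold a UUID
        }
        return input.substring(start, end);     // step 3: the UUID text itself
    }

    public static void main(String[] args) {
        System.out.println(extractUuid("&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"));
    }
}
```

Step 2 (the prefix) is left out here; if you need it, input.substring(0, start) would give everything up to and including "uuid:".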

Along similar lines your faster regex could be:

private static final Pattern URN_UUID_PATTERN =
    Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");
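
A possible way to use this pattern, as a sketch only: the class name UrnUuidDemo and the method parse are invented for illustration, and the input is assumed to contain the literal text &lt; and &gt; as in your example. Since (.{36}) accepts any 36 characters, the captured text is handed to UUID.fromString, which rejects malformed values with an IllegalArgumentException.

```java
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class UrnUuidDemo {
    private static final Pattern URN_UUID_PATTERN =
        Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");

    /** Parses the UUID out of a string of the form "&lt;urn:uuid:...&gt;". */
    static UUID parse(String input) {
        Matcher m = URN_UUID_PATTERN.matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a urn:uuid string: " + input);
        }
        // group(1) is the 36-character capture; fromString validates its shape.
        return UUID.fromString(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(parse("&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"));
    }
}
```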

OTOH, if all your input strings are going to start with those exact characters (the prefix &lt;urn:uuid: is 13 characters long), there is no need to do step 1 in the previous suggestion; just use input.substring(13, 49);

Solution 2:[2]

If this format is not going to change, I think the fastest way is to use the String.substring() method. Example:

String val = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
String sUuid = val.substring(13, 49);
UUID uuid =  UUID.fromString(sUuid);

Internally, the class java.lang.String stores its data in a character array (as in the JDK source quoted here):

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
...
    /** The value is used for character storage. */
    private final char value[];
...
}

The method 'String substring(int beginIndex, int endIndex)' copies the array elements from the begin index up to (but not including) the end index and creates a new String backed by the new array. Copying an array is a very fast operation.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 sigpwned
Solution 2 Alexander du Sautoy