'Best way to sort bracketed values in a string both alphabetically and by number?
I have an ArrayList that contains a list of JSONpaths in String format, similar to something like the following:
"$['book'][0]['title']"
"$['book'][1]['title']"
"$['book'][2]['title']"
...
"$['book'][10]['title']"
"$['movie']['series'][0]['title']"
"$['movie']['series'][1]['title']"
"$['movie']['series'][2]['title']"
...
"$['movie']['series'][10]['title']"
Don't worry about the actual data here, it's garbage, the ultimate structure is $ with a series of brackets where sometimes the value in between the brackets is an alphabetic string and sometimes it is a number.
I need to sort those ArrayList items in order, where if the node in between the brackets is in quotes it is sorted alphabetically, but if it is outside quotes, it should be sorted as a number. I.e. the order should be:
"$['book'][0]['title']"
"$['book'][1]['title']"
"$['book'][2]['title']"
"$['book'][10]['title']"
"$['movie']['series'][0]['title']"
"$['movie']['series'][1]['title']"
"$['movie']['series'][2]['title']"
"$['movie']['series'][10]['title']"
not:
"$['book'][0]['title']"
"$['book'][1]['title']"
"$['book'][10]['title']"
"$['book'][2]['title']"
"$['movie']['series'][0]['title']"
"$['movie']['series'][1]['title']"
"$['movie']['series'][10]['title']"
"$['movie']['series'][2]['title']"
I expect I need to implement a custom Comparator for this, but I'm having trouble coming up with the most efficient way to parse these strings and do the sorting. Does anyone have a suggested approach for this?
Solution 1:[1]
Custom comparator for the strings should convert the input strings into lists/arrays which could be compared then by the appropriate elements at the same indexes:
String[] data = {
"$['book'][0]['title']",
"$['book'][0]['title']['part1']",
"$['book'][0]['title']['2']",
"$['book'][1]['title']",
"$['book'][prequel]['title']",
"$['book'][sequel]['title']",
"$['movie']['series'][1]['title']",
"$['book'][10]['title']",
"$['movie']['series'][10]['title']",
"$['book'][2]['title']",
"$['movie']['series'][0]['title']",
"$['movie']['series'][2]['title']"
};
Arrays.sort(data, MyClass::customCompare);
Method customCompare
should call convertor and apply conditional comparison "for integers" if needed:
private static int customCompare(String s1, String s2) {
List<String> l1 = convert(s1);
List<String> l2 = convert(s2);
int res = 0;
for (int i = 0, n = Math.min(l1.size(), l2.size()); res == 0 && i < n; i++) {
String e1 = l1.get(i);
String e2 = l2.get(i);
if (e1.matches("\\d+") && e2.matches("\\d+")) {
res = Integer.compare(Integer.valueOf(e1), Integer.valueOf(e2));
} else {
res = e1.compareTo(e2);
}
}
return res != 0 ? res : Integer.compare(l1.size(), l2.size());
}
Method convert
cleans out unnecessary characters and creates a list of strings from the input:
private static List<String> convert(String str) {
return Arrays.stream(str.split("[$'\\[\\]]"))
.filter(x -> !x.isEmpty())
.collect(Collectors.toList());
}
Output after sorting:
$['book'][0]['title']
$['book'][0]['title']['2']
$['book'][0]['title']['part1']
$['book'][1]['title']
$['book'][2]['title']
$['book'][10]['title']
$['book'][prequel]['title']
$['book'][sequel]['title']
$['movie']['series'][0]['title']
$['movie']['series'][1]['title']
$['movie']['series'][2]['title']
$['movie']['series'][10]['title']
Solution 2:[2]
I would recommend the following steps:
- Split the path on '['
- Loop through each element (we would be looping through two paths simultaneously, the lengths aren't guaranteed to be the same)
- If the element begins with ', you know it is a word, otherwise, a number
- If the elements are of the same type, return the sorting appropriately. (You'll need to parse the integers before sorting, don't forget to remove the ']' first)
- If the elements are of different types, return the indication that the number comes first.
- Repeat until one of the paths reaches the end
- After the loop, return the indication that the shorter path comes first
This solution isn't incredible per se, but is O(n) in both space and time. It should be less complex time-wise than the sorting itself, and no more complex in space.
Solution 3:[3]
It's a little ugly, but this will work. Remove the leading "$[" and the trailing "]", and then split on "][" (because String::split uses regular expressions, I have escaped the open bracket as "]\\["
).
Then loop through the values and compare them as strings if they begin with a single quote, or as integers if they do not. If they are of different types I am just comparing the first character. This will put the strings before the numbers. If you want, switch this around. Also, if one path is longer than the other, it only compares up to the smaller of the two lengths. You may want to have either the longer or shorter path come first when all the elements of the shorter one (the prefix) are equal to the longer one, I don't know.
Comparator<String> jsonpath = (path1, path2) -> {
String[] j1 = path1.substring(2, path1.length() - 1).split("]\\[");
String[] j2 = path2.substring(2, path2.length() - 1).split("]\\[");
for (int i = 0; i < Math.min(j1.length, j2.length); ++i) {
int c;
char c1 = j1[i].charAt(0);
char c2 = j2[i].charAt(0);
if (c1 == '\'' && c2 == '\'') {
c = j1[i].compareTo(j2[i]);
} else if (c1 != '\'' && c2 != '\'') {
c = Integer.valueOf(j1[i]).compareTo(Integer.valueOf(j2[i]));
} else c = c1 - c2;
if (c != 0) return c;
}
return 0;
};
Solution 4:[4]
To split each row of string according to its data type and sort it, the Java code will be lengthy.
Suggest you using SPL, an open-source Java package, to do it. Two lines of code are enough:
A | |
---|---|
1 | =file("data.txt").read@n() |
2 | =A1(A1.(mid(~,4,len(~)-5).split@p("][")).psort()) |
SPL offers JDBC driver to be invoked by Java. Just store the above SPL script as sort.splx and invoke it in a Java application as you call a stored procedure:
…
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
st = con.prepareCall("call sort()");
st.execute();
…
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Nowhere Man |
Solution 2 | Stephen Leach |
Solution 3 | |
Solution 4 | LeoTaylor |