'How to filter on the number of rows in the value column of a group-by in nushell?
I want to do the equivalent of a HAVING clause in SQL. The real life description is I'm trying to find Elasticsearch aliases that point to two(or more) indexes and their index names. The data looks like this.
I first do a group by, then I pivot them to get rows, where the first column is the group key and the second column is a table with n rows. The command to get this table is open file.txt | lines | split-column " " --collapse-empty Alias Index | group by Alias | pivot
──┬───────────────┬────────────────
# │ Column0 │ Column1
───┼───────────────┼────────────────
0 │ abcd_20200430 │ [table 1 rows]
1 │ abcd_20200501 │ [table 3 rows]
2 │ abcd_20200502 │ [table 2 rows]
3 │ abcd_20200503 │ [table 1 rows]
4 │ abcd_20200504 │ [table 1 rows]
I want to filter this table by the rows where Column1 has more than 1 row. How would I do that?
| where Column1.count > 1
doesn't work
Solution 1:[1]
You could try adding another column with their sizes as a workaround. Something like this:
open file.txt | lines | split-column " " --collapse-empty Alias Index | group by Alias | pivot | default size 0 | update size { get Column1 | count }
Solution 2:[2]
In recent versions of Nushell you should be able to use where
and length
like so:
where ($it.Column1 | length) > 1
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Andrés N. Robalino |
Solution 2 | Reilly Wood |