A better way to slice an array (or a list) in PowerShell
How can I export mail addresses to CSV files in batches of 30 users each? I have already tried this:
$users = Get-ADUser -Filter * -Properties Mail
$nbCsv = [int][Math]::Ceiling($users.Count/30)
For($i=0; $i -le $nbCsv; $i++){
    $arr=@()
    For($j=(0*$i);$j -le ($i + 30);$j++){
        $arr+=$users[$j]
    }
    $arr|Export-Csv -Path ($PSScriptRoot + "\ASSFAM" + ("{0:d2}" -f ([int]$i)) + ".csv") -Delimiter ";" -Encoding UTF8 -NoTypeInformation
}
It works, but I think there is a better way to achieve this task. Do you have any ideas?
Thank you.
Solution 1:[1]
Bacon Bits' helpful answer shows how to simplify your code with the help of .., the range operator, but it would be nice to have a general-purpose chunking (partitioning, batching) mechanism; however, as of PowerShell 7.0, there is no built-in feature.
GitHub feature suggestion #8270 proposes adding a -ReadCount <int> parameter to Select-Object, analogous to the parameter of the same name already defined for Get-Content.
If you'd like to see this feature implemented, show your support for the linked issue there.
With that feature in place, you could do the following:
$i = 0
Get-ADUser -Filter * -Properties Mail |
    Select-Object -ReadCount 30 | # WISHFUL THINKING: output 30-element arrays
    ForEach-Object {
        $_ | Export-Csv -Path ($PSScriptRoot + "\ASSFAM" + ("{0:d2}" -f ++$i) + ".csv") -Delimiter ";" -Encoding UTF8 -NoTypeInformation
    }
In the interim, you could use the custom function Select-Chunk (source code below): replace Select-Object -ReadCount 30 with Select-Chunk -ReadCount 30 in the snippet above.
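For clarity, here is the snippet above with that substitution applied:
$i = 0
Get-ADUser -Filter * -Properties Mail |
    Select-Chunk -ReadCount 30 | # custom function - see source code below
    ForEach-Object {
        $_ | Export-Csv -Path ($PSScriptRoot + "\ASSFAM" + ("{0:d2}" -f ++$i) + ".csv") -Delimiter ";" -Encoding UTF8 -NoTypeInformation
    }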
Here's a simpler demonstration of how it works:
PS> 1..7 | Select-Chunk -ReadCount 3 | ForEach-Object { "$_" }
1 2 3
4 5 6
7
The above shows that the ForEach-Object script block receives the following three arrays, via $_, in sequence: (1, 2, 3), (4, 5, 6), and (, 7).
(When you stringify an array, by default you get a space-separated list of its elements; e.g., "$(1, 2, 3)" yields 1 2 3.)
Select-Chunk source code:
The implementation uses a [System.Collections.Generic.Queue[object]]
instance to collect the inputs in batches of fixed size.
function Select-Chunk {
  <#
  .SYNOPSIS
  Chunks pipeline input.
  .DESCRIPTION
  Chunks (partitions) pipeline input into arrays of a given size.
  By design, each such array is output as a *single* object to the pipeline,
  so that the next command in the pipeline can process it as a whole.
  That is, for the next command in the pipeline $_ contains an *array* of
  (at most) as many elements as specified via -ReadCount.
  .PARAMETER InputObject
  The pipeline input objects bind to this parameter one by one.
  Do not use it directly.
  .PARAMETER ReadCount
  The desired size of the chunks, i.e., how many input objects to collect
  in an array before sending that array to the pipeline.
  0 effectively means: collect *all* inputs and output a single array overall.
  .EXAMPLE
  1..7 | Select-Chunk 3 | ForEach-Object { "$_" }
  1 2 3
  4 5 6
  7
  The above shows that the ForEach-Object script block receives the following
  three arrays: (1, 2, 3), (4, 5, 6), and (, 7)
  #>
  [CmdletBinding(PositionalBinding = $false)]
  [OutputType([object[]])]
  param (
    [Parameter(ValueFromPipeline)]
    $InputObject
    ,
    [Parameter(Mandatory, Position = 0)]
    [ValidateRange(0, [int]::MaxValue)]
    [int] $ReadCount
  )
  begin {
    # Queue that accumulates the current chunk.
    $q = [System.Collections.Generic.Queue[object]]::new($ReadCount)
  }
  process {
    $q.Enqueue($InputObject)
    if ($q.Count -eq $ReadCount) {
      , $q.ToArray()  # Output the full chunk as a single array object.
      $q.Clear()
    }
  }
  end {
    if ($q.Count) {
      , $q.ToArray()  # Output any remaining, incomplete chunk.
    }
  }
}
Solution 2:[2]
If you want a subset of an array, you can just use .., the range operator. The first 30 elements of an array would be:
$users[0..29]
You don't have to worry about going past the end of the array, either. If there are 100 items and you're calling $array[90..119], you'll get the last 10 items in the array and no error. You can use variables and expressions there, too:
$users[$i..($i + 29)]
That's the $i-th value and the next 29 values after the $i-th value (if they exist).
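For instance, with illustrative values:
$a = 1..100
$a[0..29]          # the first 30 elements
$a[90..119]        # just the last 10 elements; indices past the end are simply ignored
$i = 30
$a[$i..($i + 29)]  # elements at indices 30 through 59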
Also, this pattern should be avoided in PowerShell:
$array = @()
loop-construct {
    $array += $value
}
Arrays are immutable in .NET, and therefore immutable in PowerShell. That means that adding an element to an array with += means "create a brand new array, copy every element over, and then put this one new item on it, and then delete the old array." It generates tremendous memory pressure, and if you're working with more than a couple hundred items it will be significantly slower.
Instead, just do this:
$array = loop-construct {
    $value
}
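A trivial illustration (made-up values) of collecting loop output this way:
# The assignment gathers everything the loop emits into an array in one go.
$squares = foreach ($n in 1..5) { $n * $n }
"$squares"   # -> 1 4 9 16 25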
Strings are similarly immutable and have the same problem with the += operator. If you need to build a string via concatenation, you should use the StringBuilder class.
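A minimal sketch of that approach (illustrative values only):
# Build the string incrementally with StringBuilder instead of string +=.
$sb = [System.Text.StringBuilder]::new()
foreach ($word in 'foo', 'bar', 'baz') {
    $null = $sb.Append($word).Append(';')   # Append() returns the builder; suppress that output
}
$sb.ToString()   # -> foo;bar;baz;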
Ultimately, however, here is how I would write this:
$users = Get-ADUser -Filter * -Properties Mail
$exportFileTemplate = Join-Path -Path $PSScriptRoot -ChildPath 'ASSFAM{0:d2}.csv'
$batchSize = 30
$batchNum = 0
$row = 0
while ($row -lt $users.Count) {
    $users[$row..($row + $batchSize - 1)] | Export-Csv ($exportFileTemplate -f $batchNum) -Encoding UTF8 -NoTypeInformation
    $row += $batchSize
    $batchNum++
}
$row and $batchNum could be rolled into one variable, technically, but this is a bit more readable, IMO.
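For what it's worth, a sketch of that one-variable variant (my illustration, not the answerer's code), deriving the batch number from $row:
$row = 0
while ($row -lt $users.Count) {
    # $row is always a multiple of $batchSize here, so the division is exact.
    $users[$row..($row + $batchSize - 1)] |
        Export-Csv ($exportFileTemplate -f [int]($row / $batchSize)) -Encoding UTF8 -NoTypeInformation
    $row += $batchSize
}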
I'm sure you could also write this with Select-Object and Group-Object, but that's going to be fairly complicated compared to the above, and Group-Object isn't entirely known for its performance prior to PowerShell 6.
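For comparison, here is a rough, untested sketch of such a Group-Object approach (my illustration, reusing $users, $batchSize, and $exportFileTemplate from the script above):
# Compute a batch number per user via a counter, group on it, and export each group.
$counter = [ref] 0
$users |
    Group-Object -Property { [math]::Floor($counter.Value / $batchSize); $counter.Value++ } |
    ForEach-Object {
        $_.Group | Export-Csv ($exportFileTemplate -f [int] $_.Name) -Encoding UTF8 -NoTypeInformation
    }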
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | mklement0 |
Solution 2 | Bacon Bits |