How to get regex to work in a Perl script?
I am working in a Debian-based Linux environment (a Proxmox server, to be precise) and I am writing a Perl script.
My problem is this: I have a folder with some files in it, and every file in this folder has a number as its name (for example: 100, 501, 102...). The lowest possible number is 100 and there is no upper limit.
I want my script to return only the files whose name is between 100 and 500. So I wrote this:
system(ls /the/path/to/my/files | grep -E "^[1-4][0-9]{2}|5[0]{2}");
I think my regex and the command are good, because when I type this into a terminal it works. But as soon as I execute my script, I get these error messages:
String found where operator expected at backupsrvproxmox.pl line 3, near "E "^[1-4][0-9]{2}|5[0]{2}""
(Do you need to predeclare E?)
Unknown regexp modifier "/b" at backupsrvproxmox.pl line 3, at end of line
syntax error at backupsrvproxmox.pl line 3, near "E "^[1-4][0-9]{2}|5[0]{2}""
Execution of backupsrvproxmox.pl aborted due to compilation errors.
I also tried with egrep, but it still does not work.
I don't understand why the error message is about the /b modifier, since I only use integers and no strings.
Any help would be appreciated!
Solution 1:[1]
Instead of using system tools via system, you can very nicely do it all in your program:
my @files = grep {
    my ($n) = m{.*/([0-9]+)}; #/
    defined $n and $n >= 100 and $n <= 500;
}
glob "/the/path/to/my/files/*";
This assumes that the numbers are at the beginning of the filename, as picked up from the question, so the subpattern for the filename itself directly follows a /. †
(That "comment" #/ on the right is there merely to turn off wrong and confusing syntax highlighting in the editor.)
The command you tried didn't work because of wrong syntax: system takes either a string or a list of strings, while you gave it a bunch of "barewords", which confused the interpreter into emitting a complex error message (most of the time perl's error messages are right to the point).
But there is no need to suffer through the syntax details, which can get rather involved for this, nor through shell invocations, which are complex and messy (under the hood) and inefficient.
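If you did want to keep the shell pipeline anyway, the whole command has to reach system as a single quoted string. Below is a minimal sketch, assuming ls and grep are on the PATH. The scratch directory and the sample file names are made up for illustration. The pattern is also tightened with grouping and a $ anchor, since the original alternation would accept names such as 1000 or 2500 as well:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directory standing in for /the/path/to/my/files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(100 250 500 501 1000)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# The entire pipeline is ONE Perl string. q{...} does not interpolate, so the
# regex reaches the shell untouched; single quotes protect it from the shell.
my $filter = q{grep -E '^([1-4][0-9]{2}|500)$'};
my $cmd    = "ls '$dir' | $filter";

# system() runs the command (output goes straight to STDOUT) and returns
# the exit status, not the output.
system($cmd) == 0 or die "Command failed: $?";

# To capture the output in Perl instead, use qx (backticks).
my @names = qx($cmd);
chomp @names;
```

Here @names ends up holding only the names that passed the filter, but every call still spawns a shell plus two external programs, which is the overhead the pure-Perl version avoids.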
† It also assumes that the files are not in the current directory. That is clearly so here, since a path is passed to glob (and not just * for files in the current directory), which returns the filename with the path, and which is why we need the .*/ to greedily get to the last / before matching the filename.
But if we are in the current directory that won't work, since there would be no / in the filename. To include this possibility the regex needs to be modified, for example like
my ($n) = m{ (?: .*/ | ^) ([0-9]+)}x;
This matches filenames beginning with a number, either after the last slash in the path (with the .*/ subpattern) or at the beginning of the string (with the ^ anchor).
The modifier /x makes it discard literal spaces in the pattern so we can use them freely (along with newlines and # for comments!) to make that mess presumably more readable. I also use {} for delimiters so as not to have to escape the / in the pattern (and with any delimiters other than // we must have that m).
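To see the modified pattern at work, here is a small sketch that runs it over a few made-up path strings and keeps those whose leading number falls between 100 and 500. The sample strings and the helper name leading_number are hypothetical, chosen only for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return the number a filename starts with, or undef
# if the filename does not begin with a digit.
sub leading_number {
    my ($path) = @_;
    my ($n) = $path =~ m{ (?: .*/ | ^ ) ([0-9]+) }x;
    return $n;
}

# Made-up inputs: with a path, without a path, out of range, and non-numeric.
my @candidates = (
    '/the/path/to/my/files/100',
    '250',
    'sub/dir/501',
    '/tmp/notes',
    '500',
);

my @in_range = grep {
    my $n = leading_number($_);
    defined $n and $n >= 100 and $n <= 500;
} @candidates;

print "$_\n" for @in_range;
```

The defined check matters: for a name like /tmp/notes the match fails and $n is undef, so the numeric comparisons are never reached.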
Solution 2:[2]
Using a regular expression to try to match a range of numbers is just a pain. And this is Perl; there is no need to shell out to external programs to get a list of files (generally also a bad idea in shell scripts; see Why you shouldn't parse the output of ls(1))!
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/say/;
sub getfiles {
    my $directory = shift;
    opendir my $dir, $directory or die "Unable to open $directory: $!";
    my @files =
        grep { /^\d+$/ && $_ >= 100 && $_ <= 500 } readdir $dir;
    closedir $dir;
    return @files;
}
my @files = getfiles '/the/path/to/my/files/';
say "@files";
Or using the useful Path::Tiny module:
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw/say/;
use Path::Tiny;
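# Returns a list of Path::Tiny objects, not just names.
sub getfiles {
    my $dir = path($_[0]);
    # Compare the basename: a Path::Tiny object stringifies to its full
    # path, which is not a number.
    return grep { $_->basename >= 100 && $_->basename <= 500 } $dir->children(qr/^\d+$/);
}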
my @files = getfiles '/the/path/to/my/files/';
say "@files";
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |