'Perl: break down a string, with some unique constraints
I'm using Perl to feed data to an LCD display. The display is 8 characters wide. The strings of data to be displayed are always significantly longer than 8 characters. As such, I need to break the strings down into "frames" of 8 characters or less, and feed the "frames" to the display one at a time.
The display is not intelligent enough to do this on its own. The only convenience it offers is that strings of less than 8 characters are automatically centered on the display.
In the beginning, I simply sent the string 8 characters at a time - here goes 1-8, now 9-16, now 17-24, etc. But that wasn't especially nice-looking. I'd like to do something better, but I'm not sure how best to approach it.
These are the constraints I'd like to implement:
- Fit as many words into a "frame" as possible
- No starting/trailing space(s) in a "frame"
- Symbol (ie. hyphen, ampersand, etc) with a space on both sides qualifies as a word
- If a word is longer than 8 characters, simulate per-character scrolling
- Break words longer than 8 characters at a slash or hyphen
Some hypothetical input strings, and desired output for each...
Electric Light Orchestra - Sweet Talkin' Woman
Electric
Light
Orchestr
rchestra
- Sweet
Talkin'
Woman
Quarterflash - Harden My Heart
Quarterf
uarterfl
arterfla
rterflas
terflash
- Harden
My Heart
Steve Miller Band - Fly Like An Eagle
Steve
Miller
Band -
Fly Like
An Eagle
Hall & Oates - Did It In A Minute
Hall &
Oates -
Did It
In A
Minute
Bachman-Turner Overdrive - You Ain't Seen Nothing Yet
Bachman-
Turner
Overdriv
verdrive
- You
Ain't
Seen
Nothing
Yet
Being a relative Perl newbie, I'm trying to picture how would be best to handle this. Certainly I could split
the string into an array of individual words. From there, perhaps I could loop through the array, counting the letters in each subsequent word to build the 8-character "frames". Upon encountering a word longer than 8 characters, I could then repetitively call substr
on that word (with offset +1 each time), creating the illusion of scrolling.
Is this a reasonable way to accomplish my goal? Or am I reinventing the wheel here? How would you do it?
Solution 1:[1]
The base question is to find all consecutive overlapping N-long substrings in a compact way.
Here it is in one pass with a regex, and see the end for doing it using substr
.
my $str = join '', "a".."k"; # 'Quarterflash';
my @eights = $str =~ /(?=(.{8}))/g;
This uses a lookahead which also captures, and in this way the regex crawls up the string character by character, capturing the "next" eight each time.
Once we are at it, here is also a basic solution for the problem. Add words to a buffer until it would exceed 8 characters, at which point it is added to an array of display-ready strings and cleared.
use warnings;
use strict;
use feature 'say';
my $str = shift // "Quarterflash - Harden My Heart";
my @words = split ' ', $str;
my @to_display;
my $buf = '';
foreach my $w (@words) {
if (length $w > 8) {
# Now have to process the buffer first then deal with this long word
push @to_display, $buf;
$buf = '';
push @to_display, $w =~ /(?=(.{8}))/g;
}
elsif ( length($buf) + 1 + length($w) > 8 ) {
push @to_display, $buf;
$buf = $w;
}
elsif (length $buf != 0) { $buf .= ' ' . $w }
else { $buf = $w }
}
push @to_display, $buf if $buf;
say for @to_display;
This is clearly missing some special/edge cases, in particular those involving non-word characters and hyphenated words, but that shouldn't be too difficult to add.†
Here is a way to get all consecutive 8-long substrings using substr
my @to_display = map { substr $str, $_, 8 } 0..length($str)-8;
† Example, break a word with hyphen/slash when it has no spaces around it (per question)
my @parts = split m{\s+|(?<=\S)[-/](?=\S)}, $w;
The hyphen/slash is discarded as this stands; that can be changed by capturing the pattern as well and then filtering out elements with only spaces
my @parts = grep { /\S/ } split m{( \s+ | (?<=\S) [-/] (?=\S) )}x, $w;
These haven't been tested beyond just barely. Can fit in the if (length $w > 8)
branch.
The initial take-- The regex was originally written with a two-part pattern. Keeping it here for record and as an example of use of pair-handling functions from List::Util
The regex below matches and captures a character, followed by a lookahead for the next seven, which it also captures. This way the engine captures 1 and 7-long substrings as it moves along char by char. Then the consecutive pairs from the returned list are joined
my $str = join '', "a".."k"; # 'Quarterflash';
use List::Util qw(pairmap);
my @eights = pairmap { $a . $b } $str =~ /(. (?=(.{7})) )/gx;
# or
# use List::Util qw(pairs);
# my @eights = map { join '', @$_ } pairs $str =~ /(.(?=(.{7})))/g;
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |