Perl: break down a string, with some unique constraints

I'm using Perl to feed data to an LCD display. The display is 8 characters wide. The strings of data to be displayed are always significantly longer than 8 characters. As such, I need to break the strings down into "frames" of 8 characters or less, and feed the "frames" to the display one at a time.

The display is not intelligent enough to do this on its own. The only convenience it offers is that strings of less than 8 characters are automatically centered on the display.

In the beginning, I simply sent the string 8 characters at a time - here goes 1-8, now 9-16, now 17-24, etc. But that wasn't especially nice-looking. I'd like to do something better, but I'm not sure how best to approach it.

These are the constraints I'd like to implement:

  • Fit as many words into a "frame" as possible
  • No starting/trailing space(s) in a "frame"
  • A symbol (i.e. hyphen, ampersand, etc.) with a space on both sides qualifies as a word
  • If a word is longer than 8 characters, simulate per-character scrolling
  • Break words longer than 8 characters at a slash or hyphen

Some hypothetical input strings, and desired output for each...

Electric Light Orchestra - Sweet Talkin' Woman

Electric
Light
Orchestr
rchestra
- Sweet
Talkin'
Woman


Quarterflash - Harden My Heart

Quarterf
uarterfl
arterfla
rterflas
terflash
- Harden
My Heart


Steve Miller Band - Fly Like An Eagle

Steve
Miller
Band -
Fly Like
An Eagle


Hall & Oates - Did It In A Minute

Hall &
Oates -
Did It
In A
Minute


Bachman-Turner Overdrive - You Ain't Seen Nothing Yet

Bachman-
Turner
Overdriv
verdrive
- You
Ain't
Seen
Nothing
Yet

Being a relative Perl newbie, I'm trying to picture how best to handle this. Certainly I could split the string into an array of individual words. From there, perhaps I could loop through the array, counting the letters in each subsequent word to build the 8-character "frames". Upon encountering a word longer than 8 characters, I could then repeatedly call substr on that word (with the offset increased by 1 each time), creating the illusion of scrolling.

Is this a reasonable way to accomplish my goal? Or am I reinventing the wheel here? How would you do it?



Solution 1:[1]

The base question is to find all consecutive overlapping N-long substrings in a compact way.

Here it is in one pass with a regex; a version using substr is shown further down.

my $str = join '', "a".."k";    # 'Quarterflash';
    
my @eights = $str =~ /(?=(.{8}))/g;

This uses a lookahead which also captures, and in this way the regex crawls up the string character by character, capturing the "next" eight each time.
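As a small illustration of the capturing lookahead described above, here is the same pattern applied to one of the long words from the question. The variable names are only for this sketch.

```perl
use strict;
use warnings;

# The lookahead matches at every position where 8 more characters exist,
# consuming nothing, so each match yields the next overlapping window.
my $word   = 'Quarterflash';
my @frames = $word =~ /(?=(.{8}))/g;

print "$_\n" for @frames;
# Quarterf
# uarterfl
# arterfla
# rterflas
# terflash
```

A 12-character word gives 12 - 8 + 1 = 5 frames, matching the scrolling output the question asks for.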

While we are at it, here is also a basic solution for the problem. Words are added to a buffer until the next word would push it past 8 characters, at which point the buffer is added to an array of display-ready strings and cleared.

use warnings;
use strict;
use feature 'say';

my $str = shift // "Quarterflash - Harden My Heart";

my @words = split ' ', $str;

my @to_display;
my $buf = '';

foreach my $w (@words) {
    if (length $w > 8) {
        # Flush the buffer first (unless it is empty), then scroll the long word
        push @to_display, $buf if length $buf;
        $buf = '';
        push @to_display, $w =~ /(?=(.{8}))/g;
    }
    elsif ( length($buf) && length($buf) + 1 + length($w) > 8 ) {
        # The word doesn't fit: emit the current frame and start a new one
        push @to_display, $buf;
        $buf = $w;
    }
    elsif (length $buf) { $buf .= ' ' . $w }
    else                { $buf  = $w       }
}
push @to_display, $buf if length $buf;   # a plain "if $buf" would drop a frame of "0"

say for @to_display;

This is clearly missing some special/edge cases, in particular those involving non-word characters and hyphenated words, but that shouldn't be too difficult to add.

Here is a way to get all consecutive 8-long substrings using substr:

my @to_display = map { substr $str, $_, 8 } 0..length($str)-8;
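A quick check, as a sketch, that the substr version yields the same frames as the lookahead regex, and what it does with a word shorter than 8 characters. The variable names are made up for this example.

```perl
use strict;
use warnings;

my $word = 'Quarterflash';

# One substr call per start offset: 0 .. length - 8
my @by_substr = map { substr $word, $_, 8 } 0 .. length($word) - 8;
my @by_regex  = $word =~ /(?=(.{8}))/g;

print "same frames\n" if "@by_substr" eq "@by_regex";

# For a short word the range 0 .. -3 is empty, so no frames are produced
my $short = 'Woman';
my @none  = map { substr $short, $_, 8 } 0 .. length($short) - 8;
print scalar(@none), " frames for $short\n";
```

Since a short word produces an empty list either way, it has to be sent to the display as-is rather than through the windowing code.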

For example, to break a word at a hyphen or slash that has no spaces around it (per the question):

my @parts = split m{\s+|(?<=\S)[-/](?=\S)}, $w;

As it stands, the hyphen/slash is discarded. To keep it, capture the pattern as well and then filter out the elements that contain only whitespace:

my @parts = grep { /\S/ } split m{( \s+ | (?<=\S) [-/] (?=\S) )}x, $w;

These have only been lightly tested. Either one can go in the if (length $w > 8) branch.
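The pieces above (buffering words, scrolling long words, breaking at a hyphen or slash) can be combined roughly as follows. This is only a sketch under some assumptions: the sub name frames and its $width parameter are hypothetical, and it uses a zero-width split so that the hyphen or slash stays attached to the left-hand piece (as in "Bachman-" from the question), which differs from the split patterns shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: break $str into frames of at most $width characters.
sub frames {
    my ($str, $width) = @_;
    $width //= 8;
    my @frames;
    my $buf = '';
    for my $word (split ' ', $str) {
        # Only over-long words are broken, right after each hyphen or slash
        my @pieces = length($word) > $width
                   ? split(m{(?<=[-/])(?=.)}, $word)
                   : ($word);
        my $glue = ' ';                    # pieces of one word are joined without a space
        for my $p (@pieces) {
            if (length($p) > $width) {     # still too long: flush, then scroll it
                push @frames, $buf if length $buf;
                $buf = '';
                push @frames, $p =~ /(?=(.{$width}))/g;
            }
            elsif (length($buf) && length($buf) + length($glue) + length($p) > $width) {
                push @frames, $buf;        # doesn't fit: start a new frame
                $buf = $p;
            }
            else {
                $buf .= (length $buf ? $glue : '') . $p;
            }
            $glue = '';
        }
    }
    push @frames, $buf if length $buf;
    return @frames;
}

print "$_\n" for frames("Bachman-Turner Overdrive - You Ain't Seen Nothing Yet");
```

For the sample strings in the question this produces the frames listed there. Other punctuation and words that are still too long after breaking are simply scrolled, which may or may not be what is wanted.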


The initial take: the regex was originally written with a two-part pattern. I'm keeping it here for the record, and as an example of the pair-handling functions from List::Util.

The regex below matches and captures a character, followed by a lookahead for the next seven, which it also captures. This way the engine captures 1-long and 7-long substrings as it moves along character by character. The consecutive pairs in the returned list are then joined:

my $str = join '', "a".."k";    # 'Quarterflash';

use List::Util qw(pairmap);

my @eights = pairmap { $a . $b } $str =~ /(. (?=(.{7})) )/gx;

# or 
# use List::Util qw(pairs);
# my @eights = map { join '', @$_ } pairs $str =~ /(.(?=(.{7})))/g;

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1