How to compose two calls to Regex::replace_all?
Regex::replace_all essentially has the signature fn(text: &str) -> Cow<str> (leaving aside the &self and replacement arguments). How would two calls to this be composed, as in f(g(x)), so that the result has the same signature?
Here's some code I'm trying to write. This has the two calls separated out into two functions, but I couldn't get it working in one function either. Here's my lib.rs in a fresh Cargo project:
#![allow(dead_code)]
//! Plaintext and HTML manipulation.

use lazy_static::lazy_static;
use regex::Regex;
use std::borrow::Cow;

lazy_static! {
    static ref DOUBLE_QUOTED_TEXT: Regex = Regex::new(r#""(?P<content>[^"]+)""#).unwrap();
    static ref SINGLE_QUOTE: Regex = Regex::new(r"'").unwrap();
}

fn add_typography(text: &str) -> Cow<str> {
    add_double_quotes(&add_single_quotes(text)) // Error! "returns a value referencing data owned by the current function"
}

fn add_double_quotes(text: &str) -> Cow<str> {
    DOUBLE_QUOTED_TEXT.replace_all(text, "“$content”")
}

fn add_single_quotes(text: &str) -> Cow<str> {
    SINGLE_QUOTE.replace_all(text, "’")
}

#[cfg(test)]
mod tests {
    use crate::add_typography;

    #[test]
    fn converts_to_double_quotes() {
        assert_eq!(add_typography(r#""Hello""#), "“Hello”");
    }

    #[test]
    fn converts_a_single_quote() {
        assert_eq!(add_typography("Today's Menu"), "Today’s Menu");
    }
}
Here's the best I could come up with, but this will get ugly fast when chaining three or four functions:
fn add_typography(input: &str) -> Cow<str> {
    match add_single_quotes(input) {
        Cow::Owned(output) => add_double_quotes(&output).into_owned().into(),
        _ => add_double_quotes(input),
    }
}
Solution 1:[1]
A Cow contains maybe-owned data.
We can infer from what the replace_all function does that it returns borrowed data only if no substitutions happened; otherwise it has to return new, owned data.
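To make the borrowed/owned distinction concrete, here is a minimal std-only sketch. Note that replace_literal and is_borrowed are hypothetical helpers, not part of the regex crate: replace_literal uses str::replace and is only assumed to follow the same convention as replace_all (borrow when nothing matches, allocate otherwise).

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `Regex::replace_all` (an assumption, not the
/// regex crate's API): replaces a literal pattern and only allocates when
/// at least one match exists.
fn replace_literal<'t>(text: &'t str, from: &str, to: &str) -> Cow<'t, str> {
    if text.contains(from) {
        // Something changed, so a new String has to be built.
        Cow::Owned(text.replace(from, to))
    } else {
        // Nothing changed, so the input can be handed back as-is.
        Cow::Borrowed(text)
    }
}

/// Reports whether a `Cow` is still borrowing its input.
fn is_borrowed(cow: &Cow<'_, str>) -> bool {
    matches!(cow, Cow::Borrowed(_))
}
```

With these helpers, is_borrowed(&replace_literal("Menu", "'", "’")) is true, while the same call on "Today's Menu" produces an owned value.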
The problem arises when the inner call makes a substitution but the outer one does not. In that case, the outer call simply passes its input through as Cow::Borrowed, but it borrows from the Cow::Owned value returned by the inner call, whose data now belongs to a Cow temporary that is local to add_typography(). The function would therefore return a Cow::Borrowed that borrows from the temporary, which is not memory-safe, so the compiler rejects it.
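The failing combination can be observed at runtime, without triggering the borrow error, by keeping both values inside one function and returning only flags. This is a sketch under assumptions: replace_literal is a hypothetical std-only stand-in for replace_all, and ownership_of_each_step is an illustrative name.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for `Regex::replace_all` (assumed to share its
// borrow-unless-changed convention), built on `str::replace`.
fn replace_literal<'t>(text: &'t str, from: &str, to: &str) -> Cow<'t, str> {
    if text.contains(from) {
        Cow::Owned(text.replace(from, to))
    } else {
        Cow::Borrowed(text)
    }
}

/// Returns `(inner_is_owned, outer_is_borrowed)` for the given input.
fn ownership_of_each_step(input: &str) -> (bool, bool) {
    // Inner call: substitutes the apostrophe if one is present.
    let inner = replace_literal(input, "'", "’");
    // Outer call: its input is `inner`, a local of this function.
    let outer = replace_literal(&inner, "<", "&lt;");
    (
        matches!(&inner, Cow::Owned(_)),
        matches!(&outer, Cow::Borrowed(_)),
    )
}
```

For "Today's Menu" this yields (true, true): outer is a Cow::Borrowed pointing into inner, so returning outer itself would leave a reference to a value that is dropped when the function ends.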
Basically, this function will only ever return borrowed data when no substitutions were made by either call. What we need is a helper that can propagate owned-ness through the call layers whenever the returned Cow is itself owned.
We can construct a .map() extension method on top of Cow that does exactly this:
use std::borrow::{Borrow, Cow};

trait CowMapExt<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    fn map<F>(self, f: F) -> Self
    where
        F: for<'b> FnOnce(&'b B) -> Cow<'b, B>;
}

impl<'a, B> CowMapExt<'a, B> for Cow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    fn map<F>(self, f: F) -> Self
    where
        F: for<'b> FnOnce(&'b B) -> Cow<'b, B>,
    {
        match self {
            // Still borrowing the original input: the result may borrow it too.
            Cow::Borrowed(v) => f(v),
            // Already owned: force the result to be owned as well.
            Cow::Owned(v) => Cow::Owned(f(v.borrow()).into_owned()),
        }
    }
}
Now your call site can stay nice and clean:
fn add_typography(text: &str) -> Cow<str> {
    add_single_quotes(text).map(add_double_quotes)
}
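Since the question worries about chaining three or four calls, here is a std-only sketch of a longer chain. It repeats the CowMapExt trait from above so that it compiles on its own. The replace_literal helper and the three step functions (curly_apostrophes, ellipses, copyright_signs) are hypothetical stand-ins for replace_all-based helpers, not part of the regex crate.

```rust
use std::borrow::{Borrow, Cow};

trait CowMapExt<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    fn map<F>(self, f: F) -> Self
    where
        F: for<'b> FnOnce(&'b B) -> Cow<'b, B>;
}

impl<'a, B> CowMapExt<'a, B> for Cow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    fn map<F>(self, f: F) -> Self
    where
        F: for<'b> FnOnce(&'b B) -> Cow<'b, B>,
    {
        match self {
            Cow::Borrowed(v) => f(v),
            Cow::Owned(v) => Cow::Owned(f(v.borrow()).into_owned()),
        }
    }
}

// Hypothetical stand-in for `Regex::replace_all` (assumed to borrow when
// nothing matches and allocate otherwise).
fn replace_literal<'t>(text: &'t str, from: &str, to: &str) -> Cow<'t, str> {
    if text.contains(from) {
        Cow::Owned(text.replace(from, to))
    } else {
        Cow::Borrowed(text)
    }
}

// Illustrative steps; each has the `fn(&str) -> Cow<str>` shape.
fn curly_apostrophes(text: &str) -> Cow<'_, str> {
    replace_literal(text, "'", "’")
}

fn ellipses(text: &str) -> Cow<'_, str> {
    replace_literal(text, "...", "…")
}

fn copyright_signs(text: &str) -> Cow<'_, str> {
    replace_literal(text, "(c)", "©")
}

// Each extra step is one more `.map(...)`; no nested `match` is needed.
fn add_typography(text: &str) -> Cow<'_, str> {
    curly_apostrophes(text).map(ellipses).map(copyright_signs)
}
```

The result is borrowed only when none of the steps changed anything. As soon as one step allocates, every later step returns owned data, at the cost of an extra copy when a later step makes no change.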
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |