Apply multiple functions with map

I have 2D data that I want to apply multiple functions to. The actual code uses xlrd and an .xlsx file, but I'll provide the following boiler-plate so the output is easy to reproduce.

class Data:
    def __init__(self, value):
        self.value = value

class Sheet:
    def __init__(self, data):
        self.data = [[Data(value) for value in row.split(',')] for row in data.split('\n')]
        self.ncols = max(len(row) for row in self.data)

    def col(self, index):
        return [row[index] for row in self.data]

Creating a Sheet:

fake_data = '''a, b, c,
               1, 2, 3, 4
               e, f, g, 
               5, 6, i, 
                , 6,  , 
                ,  ,  ,  '''

sheet = Sheet(fake_data)

In this object, data contains a 2D list of Data objects, each wrapping a string (per the input format), and I want to perform operations on its columns. Nothing up to this point is in my control.

I want to do three things to this structure: transpose the rows into columns, extract the value from each Data object, and try to convert each value to a float. If a value can't be parsed as a float, it should be converted to a str with whitespace stripped.

from operator import attrgetter

# helper function
def parse_value(value):
    try:
        return float(value)
    except ValueError:
        return str(value).strip()

# transpose
raw_cols = map(sheet.col, range(sheet.ncols))

# extract values
value_cols = (map(attrgetter('value'), col) for col in raw_cols)

# convert values
typed_cols = (map(parse_value, col) for col in value_cols)

# ['a', 1.0, 'e', 5.0, '',  '']
# ['b', 2.0, 'f', 6.0, 6.0, '']
# ['c', 3.0, 'g', 'i', '',  '']
# ['',  4.0, '',  '',  '',  '']

It can be seen that map is applied to each column twice. In other circumstances, I want to apply a function to each column more than two times.

Is there a better way to map multiple functions to the entries of an iterable? Moreover, is there a way to avoid the generator comprehension and directly apply the mapping to each inner iterable? Or is there a better, more extensible way to approach this altogether?

Note that this question is not specific to xlrd, it is only the current use-case.



Solution 1:[1]

It appears that the simplest solution is to roll your own function that applies multiple functions to the same iterable.

def map_many(iterable, function, *other):
    if other:
        return map_many(map(function, iterable), *other)
    return map(function, iterable)

The downside here is that the argument order is reversed from map(function, iterable), and it would be awkward to extend map_many to accept multiple iterables (as map itself can in Python 3.X).

Usage:

list(map_many([0, 1, 2, 3, 4], str, lambda s: s + '0', int))
# [0, 10, 20, 30, 40]
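
Applied to the question's columns, this might look like the sketch below. The raw_cols list is a hypothetical stand-in for what sheet.col would return, and Data and parse_value are copied from the question so the snippet runs on its own.

```python
from operator import attrgetter

class Data:
    def __init__(self, value):
        self.value = value

def parse_value(value):
    try:
        return float(value)
    except ValueError:
        return str(value).strip()

def map_many(iterable, function, *other):
    if other:
        return map_many(map(function, iterable), *other)
    return map(function, iterable)

# Hypothetical stand-in for the columns produced by sheet.col
raw_cols = [[Data('a'), Data(' 1'), Data(' ')],
            [Data(' b'), Data(' 2'), Data(' x')]]

# One map_many call per column replaces the two chained generator expressions
typed_cols = [list(map_many(col, attrgetter('value'), parse_value))
              for col in raw_cols]
print(typed_cols)  # [['a', 1.0, ''], ['b', 2.0, 'x']]
```

The outer comprehension over columns is still there; only the per-column chain of map calls is collapsed.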

Solution 2:[2]

You can easily combine the last two map calls using a lambda:

# Data exposes value as an attribute, so use element.value and iterate over raw_cols
typed_cols = (map(lambda element: parse_value(element.value), col)
              for col in raw_cols)

While you could similarly put the parsing and extracting inside Sheet.col, IMO that would hurt the readability of the code.
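
If more than two functions need to be chained, the lambda could be generalized into a small composition helper, so that each column is still mapped only once. This is only a sketch: compose and extract_and_parse are hypothetical names, not part of the question or of any library used in it.

```python
from functools import reduce
from operator import attrgetter

class Data:
    def __init__(self, value):
        self.value = value

def parse_value(value):
    try:
        return float(value)
    except ValueError:
        return str(value).strip()

def compose(*functions):
    """Hypothetical helper: chain functions left to right into one callable."""
    def composed(value):
        return reduce(lambda acc, func: func(acc), functions, value)
    return composed

# Extract the attribute first, then parse it
extract_and_parse = compose(attrgetter('value'), parse_value)

col = [Data('a'), Data(' 2'), Data('  ')]
print(list(map(extract_and_parse, col)))  # ['a', 2.0, '']
```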

Solution 3:[3]

As Jared Goguen noticed, the downside of his map_many is the reversed argument order. This version keeps the regular order, with the iterable last:

def map_many(function, *other, iterable=None):
    if iterable is None:
        *other, iterable = other
    if other:
        return map_many(*other, map(function, iterable))
    return map(function, iterable)

Another, more functional-style way:

from functools import reduce  # reduce is no longer a builtin in Python 3

def map_many(*funcs_with_iterable):
    *functions, iterable = funcs_with_iterable
    return reduce(lambda x, y: map(y, x), functions, iterable)
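
A possible usage sketch for the reduce-based version follows. Because the functions come first, functools.partial can bind them once and the result can be mapped directly over the columns, which is one way to avoid the generator comprehension from the question. The names convert_col and raw_cols are assumptions for illustration; Data and parse_value are copied from the question.

```python
from functools import partial, reduce
from operator import attrgetter

class Data:
    def __init__(self, value):
        self.value = value

def parse_value(value):
    try:
        return float(value)
    except ValueError:
        return str(value).strip()

def map_many(*funcs_with_iterable):
    *functions, iterable = funcs_with_iterable
    return reduce(lambda x, y: map(y, x), functions, iterable)

print(list(map_many(str, lambda s: s + '0', int, [0, 1, 2, 3, 4])))
# [0, 10, 20, 30, 40]

# Bind the functions once; the column becomes the final positional argument
convert_col = partial(map_many, attrgetter('value'), parse_value)

# Hypothetical stand-in for the columns produced by sheet.col
raw_cols = [[Data('a'), Data(' 1')], [Data(' b'), Data(' ')]]
typed_cols = [list(col) for col in map(convert_col, raw_cols)]
print(typed_cols)  # [['a', 1.0], ['b', '']]
```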

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Anurag Peshne
Solution 3