'Trying to style large number of rows with openpyxl, is there a 'faster' way?

I've got a function I use to check and style rows of a spreadsheet based on matching criteria with openpyxl. While this has been functional for nearly every smaller sheet as I've graduated to using large sets of rows the speed has slowed considerably.

I'm struggling to find ways to speed it up and was hoping for some help figuring out what I'm missing.

I believe my problem is that I am looping through every row, but then looping through each cell of the target row to match a specific column value. Though I am not sure what I can do to get around this issue (Shown below):

    for row in sheet.iter_rows():
        for cell in row:
            cell.alignment = Alignment(wrapText=True)
            if row[type_row].value in group1:

Entire code of the function and another function I use to actually format the cells:

import openpyxl as opxl
from openpyxl.styles import Font, Border, Side, PatternFill, Alignment, NamedStyle

def format_cell(cell, fill, font, border):
    cell.fill = fill
    cell.font = font
    cell.border = border

def set_style_base(sheet, type_row, col_for_other, col_for_step):
    group1 = ['Temp1', 'Temp2', 'Temp3']
    group2 = ['ACTIVITY']
    group3 = ['ACTIVITY2']
    group4 = ['OPEN']
    group5 = ['BREAK']
    group6 = ['WARNING']
    group7 = ['DIAGNOSTIC']
    pass_start_fill = PatternFill(start_color='00FF00', end_color='00FF00', fill_type='solid')
    pass_end_fill = PatternFill(start_color='99FF99', end_color='99FF99', fill_type='solid')
    error_start_fill = PatternFill(start_color='FF0000', end_color='FF0000', fill_type='solid')
    error_end_fill = PatternFill(start_color='FF6666', end_color='FF6666', fill_type='solid')
    fail_start_fill = PatternFill(start_color='FF8000', end_color='FF8000', fill_type='solid')
    fail_end_fill = PatternFill(start_color='FFB266', end_color='FFB266', fill_type='solid')
    group2_fill = PatternFill(start_color='FFA500', end_color='FFA500', fill_type='solid')
    group3_fill = PatternFill(start_color='000000', end_color='000000', fill_type='solid')
    group4_fill = PatternFill(start_color='00BFFF', end_color='00BFFF', fill_type='solid')
    group5_fill = PatternFill(start_color='800080', end_color='800080', fill_type='solid')
    group6_fill = PatternFill(start_color='FFFF00', end_color='FFFF00', fill_type='solid')
    thick = Side(border_style="thick")
    border_thick = Border(top=thick, left=thick, right=thick, bottom=thick)
    bold_font = Font(bold=True)
    def_font = Font(bold=False)
    for row in sheet.iter_rows():
        for cell in row:
            cell.alignment = Alignment(wrapText=True)
            if row[type_row].value in group1:
                if row[col_for_step].data_type == "s":
                    if 'START:' in row[col_for_step].value:
                        if row[col_for_other].data_type == "s":
                            if row[col_for_other].value == 'PASS':
                                format_cell(cell, pass_start_fill, bold_font, border_thick)
                            if row[col_for_other].value == 'FAIL':
                                format_cell(cell, fail_start_fill, bold_font, border_thick)
                            if row[col_for_other].value == 'ERROR':
                                format_cell(cell, error_start_fill, bold_font, border_thick)
                        if 'END:' in row[col_for_step].value:
                            if row[col_for_other].data_type == "s":
                                if row[col_for_other].value == 'PASS':
                                    format_cell(cell, pass_end_fill, bold_font, border_thick)
                                if row[col_for_other].value == 'FAIL':
                                    format_cell(cell, fail_end_fill, bold_font, border_thick)
                                if row[col_for_other].value == 'ERROR':
                                    format_cell(cell, error_end_fill, bold_font, border_thick)
            if row[type_row].value in group2:
                format_cell(cell, group2_fill, bold_font, border_thick)
            if row[type_row].value in group3:
                format_cell(cell, group3_fill, def_font, border_thick)
                cell.font = Font(color = 'FFA500') #Font color needs to be set
            if row[type_row].value in group4:
                format_cell(cell, group4_fill, bold_font, border_thick)
            if row[type_row].value in group5:
                format_cell(cell, group5_fill, bold_font, border_thick)
            if row[type_row].value in group6:
                format_cell(cell, group6_fill, def_font, border_thick)
                cell.font = Font(color = '000000') #Font color needs to be set
            if row[type_row].value in group7: #special case as 'DIAGNOSTIC' has no fill
                cell.border = Border(top=thick, left=thick, right=thick, bottom=thick)
                cell.font = Font(bold=True)
            if row[type_row].value not in [group1, group2, group3, group4, group5, group6, group7]:
                if row[col_for_step].data_type == "s":
                    if 'START:' in row[col_for_step].value:
                        if row[col_for_other].data_type == "s":
                                if row[col_for_other].value == 'PASS':
                                    format_cell(cell, pass_start_fill, bold_font, border_thick)
                                if row[col_for_other].value == 'FAIL':
                                    format_cell(cell, fail_start_fill, bold_font, border_thick)
                                if row[col_for_other].value == 'ERROR':
                                    format_cell(cell, error_start_fill, bold_font, border_thick)
                    if 'END:' in row[col_for_step].value:
                        if row[col_for_other].data_type == "s":
                            if row[col_for_other].value == 'PASS':
                                format_cell(cell, pass_end_fill, bold_font, border_thick)
                            if row[col_for_other].value == 'FAIL':
                                format_cell(cell, fail_end_fill, bold_font, border_thick)
                            if row[col_for_other].value == 'ERROR':
                                format_cell(cell, error_end_fill, bold_font, border_thick)

Any direction or suggestions would be amazing. Thanks



Solution 1:[1]

VBA isn't known for its speed either. It looks to me like there will be a problem processing the same cell multiple times but the structure is unclear. Dispatching would help here a bit. It looks like you only touch two cells per row, if that's the case I'd use those as min_col and max_col.

This would give you something like this.

conditions = {"PASS": {"start": pass_start_fill, "end": pass_end_fill}, "FAIL": {"start": fail_start_fill, "end", fail_end_fill}

for row in ws.iter_rows(min_col=col_for_step, max_col=col_for_other):
   step, other = row[0], row[-1]
   if step.value == "START":
      format_cell(other, fill=conditions[other.value]["start"], font=bold_font, border=border_thick])
   elif step.value == "END":
      format_cell(other, fill=conditions[other.value]["end"], font=bold_font, border=border_thick])

This needs extending but if you can simplify the code, it should be easier to identify bottlenecks.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Charlie Clark