'How to improve Merge Sort speed in python

Yes, it is homework, but I ended up doing it in Java just to get it done, but now the python implementation is bothering me. I'm pretty sure I've implemented it correctly, but it takes way longer than it should. On 3 million inputs it take on anywhere from 25 to 32 seconds. I'm assuming it has something to do with the way I'm splicing, and appending to the list. I have the source code here, let me know if you see anything.

def merge_sort(seq):
    if len(seq) == 1:
        return seq
    left = merge_sort(seq[:len(seq) // 2])
    right = merge_sort(seq[len(seq) // 2:])

    return merge(left, right)


def merge(left, right):
    result = []
    left_count = 0
    right_count = 0
    while len(left) > left_count and len(right) > right_count:
        if left[left_count] > right[right_count]:
            result.append(right[right_count])
            right_count += 1
        else:
            result.append(left[left_count])
            left_count += 1

    while len(left) > left_count:
        result.append(left[left_count])
        left_count += 1

    while len(right) > right_count:
        steps += 1
        result.append(right[right_count])
        right_count += 1

    return result


Solution 1:[1]

Using

while True:

instead of

while len(left) > left_count and len(right) > right_count:

makes it about 40-45% faster for me:

def merge(left, right):
    result = []
    left_count = 0
    right_count = 0
    try:
        while True:
            if left[left_count] > right[right_count]:
                result.append(right[right_count])
                right_count += 1
            else:
                result.append(left[left_count])
                left_count += 1
    except:
        return result + left[left_count:] + right[right_count:]

That last line doesn't seem to make it faster, but I like it a lot better.

Solution 2:[2]

From the prior thread linked to by Rishav Kundu:


You can initialise the whole result list in the top level call to mergesort:

result = [0]*len(x)   # replace 0 with a suitable default element if necessary. 
                      # or just copy x (result = x[:])

Then for the recursive calls you can use a helper function to which you pass not sublists, but indices into x. And the bottom level calls read their values from x and write into result directly.


For this to work, the parameter to the seq array needs to be a reference to seq and also the helper array.

You can also add a parameter to keep track of what direction to merge, so that the copy back step is avoided. C example using mtoa flag that means merge from b to a (if false, it means merge a to b). On my system, Intel 2600K 3.4ghz, this code sorts 4 million pseudo random 32 bit unsigned integers in about 0.36 seconds, and 16 million in about 1.6 seconds.

void TopDownMergeSort(int seq[], size_t n)
{
int * b;
    if(n < 2)
        return;
    b = malloc(n * sizeof(seq[0]));
    TopDownSplitMerge(seq, b, 0, n, true);
    free(b);
}

void TopDownSplitMerge(int a[], int b[], size_t ll, size_t ee, bool mtoa)
{
size_t rr;
    if ((ee - ll) == 1){                    // if size == 1
        if(!mtoa)                           //  copy to b if merging a to b
            b[ll] = a[ll];
        return;
    }
    rr = (ll + ee)>>1;                      // midpoint, start of right half
    TopDownSplitMerge(a, b, ll, rr, !mtoa);
    TopDownSplitMerge(a, b, rr, ee, !mtoa);
    if(mtoa)                                // if merging to a, merge b to a
        Merge(b, a, ll, rr, ee);
    else                                    // else merge a to b
        Merge(a, b, ll, rr, ee);
}

Another option would be to use bottom up merge sort, which skips the recursion steps and just starts merging even runs with odd runs, with an initial run size of 1.

Solution 3:[3]

I think you are correct. Slicing creates a new list containing the elements sliced. This is necessarily a costly operation.

In Java, there is no general slicing feature. However if you used List.subList that would return a view of the original instead of a copy and I would presume be much faster. In-place array manipulation, would be faster still.

Solution 4:[4]

This is 2.7x faster than the op's code and 2x faster than @Stefan Pochmann 's

def merge(left, right):
    result = []
    left_count = 0
    right_count = 0

    try:
        while True:
            result.append(right[right_count] if left[left_count] > right[right_count] else left[left_count])
            right_count += left[left_count] > right[right_count]
            left_count += left[left_count] <= right[right_count]
    except:
        return result + (left[left_count:] if len(left) > left_count else right[right_count:])

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Stefan Pochmann
Solution 2
Solution 3 kindall
Solution 4