'Squares of a Sorted Array, why sorted() method is faster than O(n) method? [closed]

I am working on leetcode algorithm problem 977. Squares of a Sorted Array.

Why the submissions using built-in method sorted is faster than my o(n) traversal method as below?

The input is a sorted (non-decreasing order) list with integers.

sample 208 ms submission:

class Solution:
    def sortedSquares(self, A: List[int]) -> List[int]:
        return sorted(x*x for x in A)

my 260 ms submission:

class Solution:
    def sortedSquares(self, A: List[int]) -> List[int]:
        start = 0
        end = len(A)-1

        B = []

        while start <= end:
            if abs(A[start]) >= abs(A[end]):
                B.append(A[start] * A[start])
                start += 1
            else:
                B.append(A[end] * A[end])
                end -= 1
        B.reverse()
        return B


Solution 1:[1]

Just because your algorithm has a better worst-case running time doesn't mean it will be better in practice. Python's built-in sort method is highly optimized, so its running time can be cnlg(n) for a relatively small c, while your algorithm, while being O(n), could have a really high constant d for dn. We don't know what your input is, so it could be an array of, say, 10000 elements, for which d is still significantly larger than clg(10000).

Also, as your input list is kind-of sorted (non-negative part), there might be some optimization for nearly-sorted lists in that case (not sure, never looked at Python's sort implementation).

Solution 2:[2]

The time LeetCode reports is the sum over all test cases that your solution was tested against. This is very unreliable as a measure of algorithmic efficiency, because

  • Most of the test cases are on small inputs,
  • The overhead time taken by the judge itself is substantial, and quite variable, so even moderate differences between two measurements could just depend on random variation in this overhead.

See this answer for more discussion.


So if we want to compare the two algorithms' performances, it's better to do it ourselves. I did; here are my times for lists of size up to 5,000 (averaged over 1,000 runs each):

times 1

Here are my times for lists of size up to 1,000,000 (no averaging):

times 2

As you can see, using sorted is consistently faster for both small and large inputs, so LeetCode's comparison is actually correct this time (though perhaps just by coincidence anyway!). Your solution is indeed linear time, and these results are consistent with that. But the solution using sorted also appears to take linear time, and Kelly Bundy's answer explains why it should.


My timing code is below. The times were measured using Python 3.5.2, and I get similar results with Python 3.8.1.

def linear(A):
    start = 0
    end = len(A)-1
    B = []
    while start <= end:
        if abs(A[start]) >= abs(A[end]):
            B.append(A[start] * A[start])
            start += 1
        else:
            B.append(A[end] * A[end])
            end -= 1
    B.reverse()
    return B

def using_sorted(A):
    return sorted(x*x for x in A)

from timeit import timeit
from random import randint

print('n', 'linear', 'using sorted', sep='\t')
for n in range(10000, 1000001, 10000):
    data = sorted(randint(-10*n, 10*n) for i in range(n))
    t1 = timeit(lambda: linear(data), number=1)
    t2 = timeit(lambda: using_sorted(data), number=1)
    print(n, t1, t2, sep='\t')

Solution 3:[3]

You can pre-square everything (so no abs() is needed, and especially no repeated abs() calls for a single element):

C = [x*x for x in A]

start = 0
end = len(A)-1

B = []

while start <= end:
    if C[start] >= C[end]:
        B.append(C[start])
        start += 1
    else:
        B.append(C[end])
        end -= 1
B.reverse()
return B

but I have not measured it. For example in-place pre-squaring may perform better.

Solution 4:[4]

Python's built-in sorted function, which uses Timsort, is also O(n) in this case! Your input numbers are sorted and you square them. Let's look at three cases:

  • If your input only contains positive numbers, then squaring them keeps the sortedness. Timsort detects that in O(n) time and doesn't do anything else.
  • If your input only contains negative numbers, squaring makes it sorted in reverse. For example, [-5, -4, -2, -1] becomes [25, 16, 4, 1]. Timsort detects that and simply reverses the list. Time O(n) again.
  • If your input contains both negative and positive numbers, then after squaring you'll have a descending run followed by an ascending run. Then Timsort detects both as such, reverses the descending run, and does one simple merge of the two runs. Time O(n) again.

So... both the built-in sorted and your own sorting are O(n). But sorted is implemented at the C level, which is faster and has access to shortcuts, giving it a smaller hidden constant.

Solution 5:[5]

This is because Python use Timsort which is kind of a adaptive sorting algorithm based on marge_sort and insertion_sort. and the time complexity is

O(nlogn)

for Sorted() in python.

That is why Sorted() working faster then yours

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 David L.
Solution 2
Solution 3 tevemadar
Solution 4
Solution 5 Shadiqur