Processing speed of list comprehensions and generator expressions in Python
I've run into a huge difference in performance depending on how a comprehension is passed to a function, and I'm trying to figure out the reason.
import timeit
print('sum([ ]) - ', timeit.timeit('sum([x*2 for x in range(1000)])', number=1000))
print('sum(( )) - ', timeit.timeit('sum((x*2 for x in range(1000)))', number=1000))
print('sum( ) - ', timeit.timeit('sum(x*2 for x in range(1000))', number=1000))
print('sum(list( )) - ', timeit.timeit('sum(list(x*2 for x in range(1000)))', number=1000))
Output:
sum([ ]) - 0.1341158
sum(( )) - 1.3132869999999999
sum( ) - 1.3040761
sum(list( )) - 1.3090227000000003
Why is the [] variant so much faster? Shouldn't it perform about the same as list()? It makes sense that the generator expression with and without the extra parentheses is treated the same, but why does list() run at the same speed as the bare generator, and why is the gap with [] so large?
Another example:
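One way to look at how the two forms are compiled is the standard dis module. This is only a minimal sketch: it confirms that both expressions build the same list and prints their bytecode, which differs between Python versions and says nothing by itself about timing. The variable names are mine, chosen for illustration.

```python
import dis

# The two ways of building the list that are being compared.
comprehension_src = '[x*2 for x in range(1000)]'
list_call_src = 'list(x*2 for x in range(1000))'

# Both expressions should produce exactly the same list.
same_result = eval(comprehension_src) == eval(list_call_src)
print('same result:', same_result)

# Print the bytecode of each expression; dis.dis compiles source strings itself.
for src in (comprehension_src, list_call_src):
    print(src)
    dis.dis(src)
    print()
```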
print('any([ ]) - ', timeit.timeit('any([x*2 for x in range(1000)])', number=1000))
print('any(( )) - ', timeit.timeit('any((x*2 for x in range(1000)))', number=1000))
print('any( ) - ', timeit.timeit('any(x*2 for x in range(1000))', number=1000))
print('any(list( )) - ', timeit.timeit('any(list(x*2 for x in range(1000)))', number=1000))
Here the situation is reversed:
any([ ]) - 0.11409800000000025
any(( )) - 0.005497499999999711
any( ) - 0.005274499999999627
any(list( )) - 1.3024014
The difference is partly explained by the "laziness" of any: with a bare generator expression (#2 and #3) it returns True at the first truthy value (here the second element, since 0*2 is falsy) without consuming the rest. But why is there such a big difference between [] and list()? Is the list created differently in these cases?
Perhaps I'm missing something simple; I only recently switched to Python. If anyone has come across good articles on this topic, please share.
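The short-circuit behaviour can be checked directly. The sketch below uses a hypothetical helper, doubled, that records every value it produces, so it is possible to count how many elements any actually consumed in the lazy case versus the list() case.

```python
def doubled(n, log):
    """Yield x*2 for x in range(n), recording every produced value in `log`."""
    for x in range(n):
        value = x * 2
        log.append(value)
        yield value

# Lazy: any() pulls values from the generator one at a time and stops early.
lazy_log = []
lazy_result = any(doubled(1000, lazy_log))

# Eager: list() exhausts the generator before any() sees a single value.
eager_log = []
eager_result = any(list(doubled(1000, eager_log)))

print(lazy_result, len(lazy_log))    # only 0 and 2 were produced
print(eager_result, len(eager_log))  # all 1000 values were produced
```

This accounts for the gap between the generator variants and the list variants of any, but not for the gap between [] and list().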
[ADDED LATER]
As it turned out, results on other platforms can be quite different. For example, I ran the script on Google Colab and got this (with a larger number of repetitions, for clarity):
sum([ ]) - 1.04
sum(( )) - 0.78
sum( ) - 0.79
sum(list( )) - 1.27
any([ ]) - 9.84
any(( )) - 0.000072
any( ) - 0.000070
any(list( )) - 12.08
That is, Colab gives a more predictable result. This is roughly what I expected when I started testing, which is why the local numbers surprised me so much.
The main question: why, on my local machine (Python 3.8.1 on Windows 10), is there an order-of-magnitude difference between any([ ]) and any(list( )), and between sum([ ]) and the other sum variants?
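To rule out measurement noise when comparing platforms, the timings could be repeated and the minimum taken, with the interpreter version printed alongside. This is a sketch under my own assumptions: the helper best_time and the choice of 5 repeats are illustrative, not part of the original runs.

```python
import sys
import timeit

# The same four statements as in the first example.
STATEMENTS = {
    'sum([ ])': 'sum([x*2 for x in range(1000)])',
    'sum(( ))': 'sum((x*2 for x in range(1000)))',
    'sum( )': 'sum(x*2 for x in range(1000))',
    'sum(list( ))': 'sum(list(x*2 for x in range(1000)))',
}

def best_time(stmt, number=1000, repeat=5):
    """Return the fastest of `repeat` runs, each executing `stmt` `number` times."""
    return min(timeit.repeat(stmt, number=number, repeat=repeat))

# Record which interpreter produced the numbers.
print(sys.version)

results = {label: best_time(stmt) for label, stmt in STATEMENTS.items()}
for label, seconds in results.items():
    print(f'{label} - {seconds:.6f}')
```

Taking the minimum reduces the influence of other processes on the machine, so any large gap that remains is more likely to come from the interpreter or the environment than from a one-off slowdown.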