Slow code execution - flipping a coin a billion times

For example, this code takes an hour (60 minutes) to run a mere billion iterations - what am I doing wrong?

import random


reshka = 0  # решка (tails)
orel = 0  # орёл (heads)
i = 0
while i < 1000000000: 
    coin = random.randint(0, 1)
    if coin > 0: 
        reshka += 1
    else:
        orel += 1
    i += 1
print('reshek', reshka, 'orlov', orel)

input()
Author: jfs, 2015-07-01

4 answers

You can run it unchanged on PyPy, which has a JIT compiler (51 seconds):

$ /usr/bin/time pypy .
('reshek', 499987397, 'orlov', 500012603)
50.67user 0.00system 0:50.71elapsed 99%CPU (0avgtext+0avgdata 37004maxresident)k
0inputs+0outputs (0major+3409minor)pagefaults 0swaps

Or, on the regular implementation (CPython), rewrite the code to request all billion random bits at once:

#!/usr/bin/env python
import random

N = 10**9
n = random.getrandbits(N)  # one N-bit random integer: each bit is one coin flip
popcount = bin(n).count("1")  # the number of 1 bits, i.e. the number of heads
print("heads: %d tails: %d" % (popcount, (N - popcount)))

This option is noticeably faster (8 seconds):

$ /usr/bin/time python heads-n-tails.py
heads: 499999280 tails: 500000720
7.26user 0.13system 0:07.40elapsed 99%CPU (0avgtext+0avgdata 1115464maxresident)k
0inputs+0outputs (0major+3053minor)pagefaults 0swaps
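
A side note that is not part of the original answer: on Python 3.10 and newer, the same popcount is available as int.bit_count(), which avoids building a billion-character binary string:

popcount = n.bit_count()  # Python 3.10+: counts the set bits of n directly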
Author: jfs, 2020-08-07 16:04:39

Please note the answer by @jfs; their solution is more efficient than mine.

To find out what exactly is slowing your program down, you can use a profiler. One (cProfile) ships with Python, and its documentation describes how to use it. Let's see what's wrong with your program:

python -m cProfile -s time test.py

To speed up the check, I reduced the number of iterations to 1000000. The profiler will output something like this:

     6002902 function calls (6002874 primitive calls) in 3.848 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1000000    1.162    0.000    2.728    0.000 random.py:165(randrange)
  1000000    0.992    0.000    1.566    0.000 random.py:216(_randbelow)
        1    0.694    0.694    3.848    3.848 test.py:1(<module>)
  2002055    0.518    0.000    0.518    0.000 {method 'getrandbits' of '_random.Random' objects}
  1000000    0.422    0.000    3.150    0.000 random.py:210(randint)
  1000000    0.056    0.000    0.056    0.000 {method 'bit_length' of 'int' objects}
        1    0.001    0.001    0.001    0.001 {built-in method load_dynamic}
       16    0.001    0.000    0.001    0.000 {built-in method stat}
        1    0.000    0.000    0.001    0.001 {built-in method print}
        2    0.000    0.000    0.000    0.000 {built-in method loads}
      ...

As expected, generating a random number with randint consumes the most time. The profiler output shows that a lot of it goes into randrange and _randbelow, which check that the number falls within the requested interval. The number of getrandbits calls also looks strange: twice as many as necessary. The reason is that _randbelow uses rejection sampling: for randint(0, 1) it draws bit_length(2) = 2 random bits, producing a value from 0 to 3, and retries whenever the value is 2 or 3, so on average getrandbits runs about twice per number. You can speed up generating a random 0-or-1 value by calling getrandbits directly:

coin = random.getrandbits(1)
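
Applied to the question's loop (shown here with the million iterations used for profiling), the change looks like this:

import random

reshka = 0  # tails
orel = 0  # heads
for i in range(1000000):
    coin = random.getrandbits(1)  # one random bit instead of randint(0, 1)
    if coin > 0:
        reshka += 1
    else:
        orel += 1
print('reshek', reshka, 'orlov', orel)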

This way we get rid of unnecessary calculations and get a noticeable performance increase:

         1000847 function calls (1000819 primitive calls) in 0.837 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.440    0.440    0.837    0.837 test.py:1(<module>)
  1000000    0.237    0.000    0.237    0.000 {method 'getrandbits' of '_random.Random' objects}
        2    0.093    0.046    0.094    0.047 <frozen importlib._bootstrap>:1031(get_data)
        1    0.062    0.062    0.062    0.062 {built-in method load_dynamic}
        2    0.001    0.001    0.001    0.001 {built-in method init_builtin}
        2    0.001    0.001    0.001    0.001 {method 'read' of '_io.FileIO' objects}
       16    0.001    0.000    0.001    0.000 {built-in method stat}
        1    0.001    0.001    0.001    0.001 {built-in method print}
        2    0.000    0.000    0.000    0.000 {built-in method loads}
 ...

However, with a billion iterations the program will still run slowly. Initially I assumed the performance was limited by Python's random number generator, so I rewrote the program to call a generator from the C runtime library instead:

from ctypes import cdll
libc = cdll.msvcrt  # the Windows C runtime; on Linux, load libc.so.6 instead

reshka = 0
orel = 0

for i in range(1000000):
    coin = libc.rand() % 2
    if coin > 0: 
        reshka += 1
    else:
        orel += 1

But it didn't give any performance boost; each libc.rand() call still pays the per-call overhead of going through ctypes, which outweighs the cheaper generator:

         1753 function calls (1703 primitive calls) in 0.753 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.744    0.744    0.753    0.753 test.py:1(<module>)
        2    0.002    0.001    0.002    0.001 {built-in method load_dynamic}
       32    0.002    0.000    0.002    0.000 {built-in method stat}
       37    0.001    0.000    0.001    0.000 {built-in method __build_class__}
        1    0.001    0.001    0.001    0.001 {built-in method print}
        4    0.001    0.000    0.001    0.000 {built-in method loads}
        4    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:1031(get_data)
       12    0.000    0.000    0.000    0.000 {built-in method OpenKey}
       19    0.000    0.000    0.002    0.000 <frozen importlib._bootstrap>:1352(find_loader)
       71    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:74(_path_join)
        1    0.000    0.000    0.003    0.003 __init__.py:1(<module>)
      ...

Then I sketched a similar program in C and measured its speed:

#include <stdio.h>
#include <math.h>
#include <stdlib.h>

int main() {
    int heads = 0, tails = 0, i = 0;
    srand(0);
    for (i = 0; i < 1000000; i++) {
        if (rand() % 2) {
            heads++;
        } else {
            tails++;
        }
    }

    printf("Heads: %d Tails: %d\n", heads, tails);
    return 0;
}
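
For reference, assuming the source is saved as coin.c (the filename is my assumption), a typical way to build and time it is:

$ gcc -O2 coin.c -o coin
$ /usr/bin/time ./coin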

It took 0.045 seconds to complete; with a billion iterations, 16.98 seconds. The conclusion is that Python spends too much overhead on interpreting the program. It doesn't matter how fast the critical sections execute - you still get a decent slowdown simply because Python is an interpreted language. Perhaps execution can be sped up with Cython or PyPy, but is it necessary? If so, read this article, which provides an introduction to optimizing Python programs.

Author: fori1ton, 2017-04-13 12:53:29

You can use the NumPy module to generate the random numbers and reduce the time to about 20 seconds, without using any additional compilers.

import numpy, time

def test(n):
    t1 = time.time()
    n2 = n // (10**7)  # if you don't have a hundred gigabytes of RAM, you have to count in chunks
    reshka = 0
    orel = 0
    while n2 > 0:
        rnd = numpy.random.randint(0, 2, 10**7)
        nonzero = numpy.count_nonzero(rnd)
        orel += nonzero  # heads is 1, i.e. a nonzero value
        reshka += 10**7 - nonzero
        n2 -= 1
    t2 = time.time()
    print(reshka, orel, t2 - t1)

test(10**9)

Result:

500004947 499995053 19.711090803146362

That is, 20 seconds with no special tricks other than NumPy. The execution time is comparable to the C result in @fori1ton's answer.
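
A further note that is not from the original answer: since the flips are independent fair Bernoulli trials, NumPy can sample the total number of heads with a single numpy.random.binomial call, without allocating any billion-element arrays:

import numpy

N = 10**9
orel = numpy.random.binomial(N, 0.5)  # number of heads among N fair flips
print(N - orel, orel)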

Author: ReinRaus, 2020-08-07 16:06:31

An interesting problem. You can try to do without loops entirely and make only one call to random. In five minutes I sketched something like this:

from random import randrange
from math import sqrt

if __name__ == '__main__':
    n = 10**9
    r = randrange(0, n * 2 + 1)

    if r < n:
        ans = n // 2 - sqrt(r)
    else:
        ans = n // 2 + sqrt(r - n // 2)

    print(int(ans), n - int(ans))

The idea is that flipping a coin a billion times produces a random number from one to a billion, and the closer that number is to half a billion, the higher the chance of getting it.

I'm not sure I got the math exactly right, though.
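
For comparison, here is a statistically sound single-draw sketch (my addition, not part of this answer): by the central limit theorem, the number of heads in n fair flips is approximately normally distributed with mean n/2 and standard deviation sqrt(n)/2, so one call to random.gauss is enough:

import random

n = 10**9
# normal approximation to Binomial(n, 0.5): mean n/2, standard deviation sqrt(n)/2
orel = round(random.gauss(n / 2, n ** 0.5 / 2))
print(orel, n - orel)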

Author: rew, 2020-08-07 18:34:38