Why is taking the mod of a number in python faster with exponents?

--------------------------------------------------
Hire the world's top talent on demand or became one of them at Toptal: https://topt.al/25cXVn
and get $2,000 discount on your first invoice
--------------------------------------------------

Take control of your privacy with Proton's trusted, Swiss-based, secure services.
Choose what you need and safeguard your digital life:
Mail: https://go.getproton.me/SH1CU
VPN: https://go.getproton.me/SH1DI
Password Manager: https://go.getproton.me/SH1DJ
Drive: https://go.getproton.me/SH1CT

Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Game 5 Looping

--

Chapters
00:00 Why Is Taking The Mod Of A Number In Python Faster With Exponents?
02:08 Accepted Answer Score 20
03:34 Answer 2 Score 6
04:54 Answer 3 Score 5
05:25 Thank you

--

Full question
https://stackoverflow.com/questions/1055...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #optimization #profiling

#avk47

ACCEPTED ANSWER

Score 20

There is no difference in the generated bytecode, because the compiler does its job well and optimizes away the constant arithmetic expression. That means your test results are just a coincidence (try timing the functions in a different order!).

>>> import dis
>>> dis.dis(test1)
  2           0 SETUP_LOOP              30 (to 33)
              3 LOAD_GLOBAL              0 (xrange)
              6 LOAD_GLOBAL              1 (AMOUNT)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               0 (i)

  3          19 LOAD_FAST                0 (i)
             22 LOAD_CONST               1 (65536)
             25 BINARY_MODULO       
             26 STORE_FAST               1 (value)
             29 JUMP_ABSOLUTE           13
        >>   32 POP_BLOCK           

  4     >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE        
>>> dis.dis(test5)
  2           0 SETUP_LOOP              30 (to 33)
              3 LOAD_GLOBAL              0 (xrange)
              6 LOAD_GLOBAL              1 (AMOUNT)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               0 (i)

  3          19 LOAD_FAST                0 (i)
             22 LOAD_CONST               3 (65536)
             25 BINARY_MODULO       
             26 STORE_FAST               1 (value)
             29 JUMP_ABSOLUTE           13
        >>   32 POP_BLOCK           

  4     >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE

(well actually there is a difference: The number is stored at different offsets in the constant table. I can't imagine this causing any difference, though).

For completeness, here's a proper test that uses the timeit module:

import timeit

setup = "i = 1337"

best1 = best2 = float("inf")
for _ in range(5000):
  best1 = min(best1, timeit.timeit("i % 65536", setup=setup, number=10000))
for _ in range(5000):
  best2 = min(best2, timeit.timeit("i % (2**16)", setup=setup, number=10000))
print best1
print best2

Note that I am measuring the minimum time needed, rather than the average. If it takes longer for some reason, this just means that it was interrupted more often (because the code doesn't depend on anything but the power of your CPU).

ANSWER 2

Score 6

Hmmm, using dis to show the python bytes codes shows that the functions are identical. Python has optimised the constant (as expected). So I suspect the time differences are caching effects. The timings on my laptop bear this out (using Python 2.7.3 64 bit on linux)

Fri May 11 23:37:49 2012    results

     8 function calls in 38.825 seconds

Ordered by: call count, name/file/line

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     1    0.000    0.000   38.825   38.825 <string>:1(<module>)
     1    0.000    0.000   38.825   38.825 z.py:31(run_tests)
     1    7.880    7.880    7.880    7.880 z.py:6(test1)
     1    7.658    7.658    7.658    7.658 z.py:11(test2)
     1    7.806    7.806    7.806    7.806 z.py:16(test3)
     1    7.784    7.784    7.784    7.784 z.py:21(test4)
     1    7.697    7.697    7.697    7.697 z.py:26(test5)

All pretty much identical

>>> from dis import dis
>>> def test1():
...     for i in xrange(AMOUNT):
...         value = i % 65536
...     return
... 
>>> def test5():
...     for i in xrange(AMOUNT):
...         value = i % (2**16)
...     return
... 
>>> dis(test1)
  2           0 SETUP_LOOP              30 (to 33)
              3 LOAD_GLOBAL              0 (xrange)
              6 LOAD_GLOBAL              1 (AMOUNT)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               0 (i)

  3          19 LOAD_FAST                0 (i)
             22 LOAD_CONST               1 (65536)
             25 BINARY_MODULO       
             26 STORE_FAST               1 (value)
             29 JUMP_ABSOLUTE           13
        >>   32 POP_BLOCK           

  4     >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE        
>>> dis(test5)
  2           0 SETUP_LOOP              30 (to 33)
              3 LOAD_GLOBAL              0 (xrange)
              6 LOAD_GLOBAL              1 (AMOUNT)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               0 (i)

  3          19 LOAD_FAST                0 (i)
             22 LOAD_CONST               3 (65536)
             25 BINARY_MODULO       
             26 STORE_FAST               1 (value)
             29 JUMP_ABSOLUTE           13
        >>   32 POP_BLOCK           

  4     >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE        
>>>

ANSWER 3

Score 5

You have run every test only once. The speed of your CPU isn't the same all the time, at the start of the test it was most probably sleeping and that's why the first test was slower. For benchmarking small parts of code (like mod) use timeit module:

>>> timeit.timeit('for i in range(10000): i % 65536', number=1000)
0.8686108589172363
>>> timeit.timeit('for i in range(10000): i % 256**2', number=1000)
0.862062931060791
>>> timeit.timeit('for i in range(10000): i % 4**8', number=1000)
0.8644928932189941
>>> timeit.timeit('for i in range(10000): i % 2**16', number=1000)
0.8643178939819336
>>> timeit.timeit('for i in range(10000): i % 65536', number=1000)
0.8640358448028564

You can see that the average is always around 0.864.