Why is taking the mod of a number in python faster with exponents?
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Island
--
Chapters
00:00 Question
02:48 Accepted answer (Score 19)
04:36 Answer 2 (Score 6)
06:08 Answer 3 (Score 4)
06:45 Thank you
--
Full question
https://stackoverflow.com/questions/1055...
Answer 1 links:
[dis]: http://docs.python.org/library/dis.html
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #optimization #profiling
#avk47
ACCEPTED ANSWER
Score 20
There is no difference in the generated bytecode, because the compiler does its job well and optimizes away the constant arithmetic expression. That means your test results are just a coincidence (try timing the functions in a different order!).
>>> import dis
>>> dis.dis(test1)
2 0 SETUP_LOOP 30 (to 33)
3 LOAD_GLOBAL 0 (xrange)
6 LOAD_GLOBAL 1 (AMOUNT)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 16 (to 32)
16 STORE_FAST 0 (i)
3 19 LOAD_FAST 0 (i)
22 LOAD_CONST 1 (65536)
25 BINARY_MODULO
26 STORE_FAST 1 (value)
29 JUMP_ABSOLUTE 13
>> 32 POP_BLOCK
4 >> 33 LOAD_CONST 0 (None)
36 RETURN_VALUE
>>> dis.dis(test5)
2 0 SETUP_LOOP 30 (to 33)
3 LOAD_GLOBAL 0 (xrange)
6 LOAD_GLOBAL 1 (AMOUNT)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 16 (to 32)
16 STORE_FAST 0 (i)
3 19 LOAD_FAST 0 (i)
22 LOAD_CONST 3 (65536)
25 BINARY_MODULO
26 STORE_FAST 1 (value)
29 JUMP_ABSOLUTE 13
>> 32 POP_BLOCK
4 >> 33 LOAD_CONST 0 (None)
36 RETURN_VALUE
(well actually there is a difference: The number is stored at different offsets in the constant table. I can't imagine this causing any difference, though).
For completeness, here's a proper test that uses the timeit module:
import timeit
setup = "i = 1337"
best1 = best2 = float("inf")
for _ in range(5000):
best1 = min(best1, timeit.timeit("i % 65536", setup=setup, number=10000))
for _ in range(5000):
best2 = min(best2, timeit.timeit("i % (2**16)", setup=setup, number=10000))
print best1
print best2
Note that I am measuring the minimum time needed, rather than the average. If it takes longer for some reason, this just means that it was interrupted more often (because the code doesn't depend on anything but the power of your CPU).
ANSWER 2
Score 6
Hmmm, using dis to show the python bytes codes shows that the functions are identical. Python has optimised the constant (as expected). So I suspect the time differences are caching effects. The timings on my laptop bear this out (using Python 2.7.3 64 bit on linux)
Fri May 11 23:37:49 2012 results
8 function calls in 38.825 seconds
Ordered by: call count, name/file/line
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 38.825 38.825 <string>:1(<module>)
1 0.000 0.000 38.825 38.825 z.py:31(run_tests)
1 7.880 7.880 7.880 7.880 z.py:6(test1)
1 7.658 7.658 7.658 7.658 z.py:11(test2)
1 7.806 7.806 7.806 7.806 z.py:16(test3)
1 7.784 7.784 7.784 7.784 z.py:21(test4)
1 7.697 7.697 7.697 7.697 z.py:26(test5)
All pretty much identical
>>> from dis import dis
>>> def test1():
... for i in xrange(AMOUNT):
... value = i % 65536
... return
...
>>> def test5():
... for i in xrange(AMOUNT):
... value = i % (2**16)
... return
...
>>> dis(test1)
2 0 SETUP_LOOP 30 (to 33)
3 LOAD_GLOBAL 0 (xrange)
6 LOAD_GLOBAL 1 (AMOUNT)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 16 (to 32)
16 STORE_FAST 0 (i)
3 19 LOAD_FAST 0 (i)
22 LOAD_CONST 1 (65536)
25 BINARY_MODULO
26 STORE_FAST 1 (value)
29 JUMP_ABSOLUTE 13
>> 32 POP_BLOCK
4 >> 33 LOAD_CONST 0 (None)
36 RETURN_VALUE
>>> dis(test5)
2 0 SETUP_LOOP 30 (to 33)
3 LOAD_GLOBAL 0 (xrange)
6 LOAD_GLOBAL 1 (AMOUNT)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 16 (to 32)
16 STORE_FAST 0 (i)
3 19 LOAD_FAST 0 (i)
22 LOAD_CONST 3 (65536)
25 BINARY_MODULO
26 STORE_FAST 1 (value)
29 JUMP_ABSOLUTE 13
>> 32 POP_BLOCK
4 >> 33 LOAD_CONST 0 (None)
36 RETURN_VALUE
>>>
ANSWER 3
Score 5
You have run every test only once. The speed of your CPU isn't the same all the time, at the start of the test it was most probably sleeping and that's why the first test was slower. For benchmarking small parts of code (like mod) use timeit module:
>>> timeit.timeit('for i in range(10000): i % 65536', number=1000)
0.8686108589172363
>>> timeit.timeit('for i in range(10000): i % 256**2', number=1000)
0.862062931060791
>>> timeit.timeit('for i in range(10000): i % 4**8', number=1000)
0.8644928932189941
>>> timeit.timeit('for i in range(10000): i % 2**16', number=1000)
0.8643178939819336
>>> timeit.timeit('for i in range(10000): i % 65536', number=1000)
0.8640358448028564
You can see that the average is always around 0.864.