The Python Oracle

Is the builtin hash method of Python2.6 stable across architectures?

Full question
https://stackoverflow.com/questions/5583...

Accepted answer links:
[hashlib]: http://docs.python.org/library/hashlib.h...

Answer 2 links:
[hashable]: http://docs.python.org/glossary.html#ter...
[id()]: http://docs.python.org/library/functions...

Answer 3 links:
[hashlib]: http://docs.python.org/library/hashlib.h...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python



ACCEPTED ANSWER

Score 12


If you need a well-defined hash, use one of the algorithms in hashlib.




ANSWER 2

Score 9


The hash() function is not what you want; finding a reliable way to serialize the object (e.g. str() or repr()) and running the result through hashlib.md5() would probably be preferable.

In detail: hash() is designed to return an integer that identifies an object only within its lifetime. Once the program is run again, a newly constructed object may well have a different hash. Destroying an object means there's a chance another object will have that hash in the future. See Python's definition of hashable for more.

Behind the scenes, most user-defined Python objects fall back to id() to provide their hash value. While you're not supposed to rely on this, id(obj), and thus hash(obj), is usually implemented (e.g. in CPython) as the memory address of the underlying Python object. Thus you can see why it can't be relied on for anything.
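To make the identity-based fallback concrete, here is a minimal sketch written for Python 3. The Point class is a hypothetical example, not something from the question. It only shows that two objects with the same contents are still distinct objects, and that the default hash follows identity, not contents.

```python
class Point(object):
    """Hypothetical user-defined class that does not define __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


a = Point(1, 2)
b = Point(1, 2)

# Same contents, but two distinct live objects.
same_identity = a is b

# The default hash is derived from identity, so in CPython these
# two hashes are expected to differ even though the fields match.
same_hash = hash(a) == hash(b)

# Within one object's lifetime the hash stays the same.
stable_in_process = hash(a) == hash(a)

print(same_identity)
print(same_hash)
print(stable_in_process)
```

Since the values depend on where the objects happen to live in memory, they will generally change from run to run.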

The behavior you currently see is only reliable for certain built-in Python objects, and even then not very far. hash({}), for instance, is not possible: dicts are unhashable, so the call raises TypeError.
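A quick way to check this for yourself, as a small sketch (is_hashable is a hypothetical helper name, not a built-in):

```python
def is_hashable(obj):
    """Hypothetical helper: return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        # Raised for unhashable objects such as dicts and lists.
        return False
    return True


print(is_hashable({}))            # False: dicts are unhashable
print(is_hashable((1, 2, 3, 4)))  # True: a tuple of ints is hashable
```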


Regarding hashlib.md5(str(obj)) or equivalent: you'll need to make sure str(obj) is reliably the same. In particular, if a dictionary is rendered within that string, it may not list its keys in the same order. There may also be subtle differences between Python versions. I would definitely recommend unit tests for any implementation you rely on.
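One possible way around the key-order problem is sketched below for Python 3. It swaps str(obj) for json.dumps with sort_keys=True, which is a substitution on my part, not what the answer prescribes, and it only works for JSON-serializable data. The name stable_digest is hypothetical.

```python
import hashlib
import json


def stable_digest(obj):
    """Hypothetical helper: MD5 hex digest of a canonical JSON rendering."""
    # sort_keys=True fixes the key order; the compact separators
    # remove optional whitespace from the output.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


first = {"name": "spam", "count": 3}
second = {"count": 3, "name": "spam"}  # same data, different insertion order

print(stable_digest(first) == stable_digest(second))  # True
```

A unit test that pins the digest of a few known inputs would catch any drift between Python versions or platforms.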




ANSWER 3

Score 6


No.

x86_64
>>> print hash("a")
12416037344

i386
>>> print hash("a")
-468864544

If you need a stable hash, create a digest of your data using something like sha1, which can be found in hashlib.
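As a minimal sketch of that suggestion (written for Python 3, where the string has to be encoded to bytes first; stable_hash is a hypothetical helper name and UTF-8 is an assumed encoding):

```python
import hashlib


def stable_hash(text):
    """Hypothetical helper: SHA-1 hex digest of text encoded as UTF-8."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# The digest depends only on the bytes, not on the machine architecture.
print(stable_hash("a"))  # 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8
```

Unlike hash("a"), this value should come out the same on x86_64 and i386, as long as the input bytes are identical.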




ANSWER 4

Score 5


No. On ARM with Python 2.6:

>>> hash((1,2,3,4))
89902565