Can a hash algorithm such as MD5/SHA-1 generate an ID with a lower probability of collision than a pure random number?
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Cosmic Puzzle
--
Chapters
00:00 Can Hash Algorithm Such As Md5/Sha-1 Generate An Id With Less Probability Of Collision Than Pure Ran
01:05 Accepted Answer Score 2
02:18 Thank you
--
Full question
https://stackoverflow.com/questions/5123...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #random #md5 #uuid
#avk47
ACCEPTED ANSWER
Score 2
Generating X bytes of random data gives the same collision probability as using a hash function with an X-byte output on some IDs...
ASSUMING...
- The columns you're using the hash function on are themselves unique.
- You haven't made any mistakes in ensuring the first point.
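
To put a rough number on that collision probability, here is a minimal sketch using the standard birthday approximation. It assumes the IDs are uniformly distributed over the whole output space, which holds for good random bytes and, ideally, for a well-behaved hash over unique inputs. The function name `collision_probability` is made up for illustration and is not part of the original answer.

```python
import math

def collision_probability(n_ids: int, n_bits: int) -> float:
    """Approximate chance that at least two of n_ids uniformly
    distributed n_bits-bit values are equal (birthday approximation)."""
    space = 2.0 ** n_bits
    # p is roughly 1 - exp(-n(n-1) / (2N)); expm1 keeps precision
    # when the probability is extremely small.
    return -math.expm1(-n_ids * (n_ids - 1) / (2.0 * space))

# One billion IDs at 128 bits (MD5-sized) and 160 bits (SHA-1-sized).
print(collision_probability(10**9, 128))
print(collision_probability(10**9, 160))
```

Under this approximation the result depends only on the number of IDs and the number of output bits, not on whether the bits came from a hash or from a random source.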
 
I would recommend using the system's cryptographic random number provider, because you've probably made mistakes. Here's an easy one:
Your system: Concatenate column 1 and column 2, and hash the result. You can guarantee you'll never ever do this on those values of column 1 and column 2 ever again. NEVER.
What about when:
- Column 1 = "abc"
- Column 2 = "def"

vs.
- Column 1 = "ab"
- Column 2 = "cdef"

Both pairs concatenate to "abcdef", so they would produce the same hash.
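
The concatenation pitfall described above can be reproduced in a few lines. This is only a sketch: the helper names `naive_id` and `length_prefixed_id` are hypothetical, and the length-prefix variant is one possible way to keep field boundaries unambiguous, not something the original answer proposes.

```python
import hashlib

def naive_id(col1: str, col2: str) -> str:
    # Plain concatenation: the boundary between the two columns is lost.
    return hashlib.sha1((col1 + col2).encode("utf-8")).hexdigest()

def length_prefixed_id(col1: str, col2: str) -> str:
    # Assumed fix for illustration: feed each field's byte length before
    # the field itself, so ("abc", "def") and ("ab", "cdef") hash
    # different byte streams.
    h = hashlib.sha1()
    for field in (col1, col2):
        data = field.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()

print(naive_id("abc", "def") == naive_id("ab", "cdef"))                      # True
print(length_prefixed_id("abc", "def") == length_prefixed_id("ab", "cdef"))  # False
```

Even with unambiguous encoding, the scheme still depends on the column values never repeating, which is the first assumption listed earlier.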
So who would you trust more to give you random data? Yourself? Or a team of operating system developers including cryptography experts and decades of research and experience? :)
Go with the system's cryptographic random function.
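
In Python (the language the question is tagged with), one way to follow this advice is the standard library's `secrets` module, which draws from the operating system's cryptographic random source. The wrapper name `random_id` and the 16-byte default are assumptions made for this sketch.

```python
import secrets
import uuid

def random_id(n_bytes: int = 16) -> str:
    """Return a hex ID built from n_bytes of OS-provided cryptographic
    randomness (16 bytes matches the size of an MD5 digest)."""
    return secrets.token_hex(n_bytes)

print(random_id())    # 32 hex characters
print(random_id(20))  # 40 hex characters, the size of a SHA-1 digest
print(uuid.uuid4())   # a random (version 4) UUID is another option
```

Nothing here depends on column values being unique or on how fields are joined, so the mistakes described above cannot occur.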