In my web application, and a few other places I am using the first five or so digits of random MD5's, as a02f2 is less less revealing to the eye than 123, 124, etc and the string needs to be fairly simple to our users/us to remember.
My main concern is that the numbers will repeat, I believe the first numbers of the MD5 will not be dependent of eachother however I have heard voiced concerns over MD5 collision in web and other security applications.
Are the first 5 or so characters smart to use? Can you give me some solid examples of when exactly it would become an issue of collision?
3 replies to this topic
#1
Posted 13 September 2011 - 11:03 PM
|
|
|
#2
Posted 14 September 2011 - 12:21 AM
You could understand your choice as the following:
A rough formula, to avoid using extremely large numbers (i.e. 65536!) using the Taylor series expansion (of the exponential function e) can be used in finding the probability two random hashes may collide.
where n is 2, and k is 2^20, will equal 0.0000009 percent (or. 9/10000000 hashes)
In reverse you can also calculate n from your specified p using the natural logarithm of e. p will be 0.5 in the following (fifty percent chance is more than significant)
1205 hashes will theoretically contain a 50 percent chance of collision, and is significant enough and also dangerously likely if you are using it to keep track of files or records.
If you could use at least the full lower case range,with numbers, that will easily increase the amount of possible combinations (26^5 = 11 mil) and you may need to worry less of the issue.
Some references you could use, based on well known problems used to describe this:
Birthday attack - Wikipedia, the free encyclopedia
Birthday problem - Wikipedia, the free encyclopedia
Alexander.
- Base 16 (hexadecimal), allows 16 unique symbols to represent itself
- 16*16*16*16*16, or 16^5, can represent the equivalent of twenty bits of information.
A rough formula, to avoid using extremely large numbers (i.e. 65536!) using the Taylor series expansion (of the exponential function e) can be used in finding the probability two random hashes may collide.

where n is 2, and k is 2^20, will equal 0.0000009 percent (or. 9/10000000 hashes)
In reverse you can also calculate n from your specified p using the natural logarithm of e. p will be 0.5 in the following (fifty percent chance is more than significant)

1205 hashes will theoretically contain a 50 percent chance of collision, and is significant enough and also dangerously likely if you are using it to keep track of files or records.
If you could use at least the full lower case range,with numbers, that will easily increase the amount of possible combinations (26^5 = 11 mil) and you may need to worry less of the issue.
Some references you could use, based on well known problems used to describe this:
Birthday attack - Wikipedia, the free encyclopedia
Birthday problem - Wikipedia, the free encyclopedia
Alexander.
Attached Files
Edited by Alexander, 14 September 2011 - 03:47 AM.
Be sure to read the updated FAQ! || Health is achieved through the same 10,000 steps.
If a suggested code/method fails, informing us is less important than telling us why or what errors occurred.
If a suggested code/method fails, informing us is less important than telling us why or what errors occurred.
#3
Posted 14 September 2011 - 03:54 AM
I had never realized it could be calculated even by hand, and that the character range played a huge® part in the amount permutations. I see no problem in adding more of the characters, there was an option to use either five or ten length however I'll increase that a little depending on what they wish.
Thank you for the informative writeup Alexander.
A small request, and you do not have to do this: Is it possible to generate a graph (in a simple manner) to see the formula in action for myself?
Thank you for the informative writeup Alexander.
A small request, and you do not have to do this: Is it possible to generate a graph (in a simple manner) to see the formula in action for myself?
Edited by Fighter, 14 September 2011 - 04:01 AM.
Typo
#4
Posted 14 September 2011 - 04:04 AM
You are welcome, there are a lot of little neat functions out there to calculate these things.
Have you heard of Mathematica? They've an online service called Wolfram|Alpha which shares a fair amount of features, you can plug this in (excuse the encoded url) and it should give you the appropriate graph:
The right hand column is for the iterator, telling it to make n go from 0 to 4000 on the x axis.
Alexander.
Fighter said:
A small request, and you do not have to do this: Is it possible to generate a graph (in a simple manner) to see the formula in action for myself?
Have you heard of Mathematica? They've an online service called Wolfram|Alpha which shares a fair amount of features, you can plug this in (excuse the encoded url) and it should give you the appropriate graph:
[url=http://www.wolframalpha.com/input/?i=Plot%5B1-e^%28-n^2%2F%282*16^5%29%29+%2C+{n%2C0%2C4000}%5D]Plot[1-e^(-n^2/(2*16^5)) , {n,0,4000}] - Wolfram|Alpha[/url]
The right hand column is for the iterator, telling it to make n go from 0 to 4000 on the x axis.
Alexander.
Be sure to read the updated FAQ! || Health is achieved through the same 10,000 steps.
If a suggested code/method fails, informing us is less important than telling us why or what errors occurred.
If a suggested code/method fails, informing us is less important than telling us why or what errors occurred.
1 user(s) are reading this topic
0 members, 1 guests, 0 anonymous users


Sign In
Create Account


Back to top










