Ok, for those who've been following the thread, here's an update from my side.
NOTE: Throughout this explanation, ctx, ctx1, ctx2, ctx3 ... are MD5 contexts, and final and final2 are MD5 digest arrays (unsigned char[16]). The same variable name always refers to the same variable, i.e. the scope of every variable is this mail.
1. Yahoo uses the exact same algorithm as pam does to generate MD5 passwords.
You can get the source for this from the pam package. Look in modules/pam_unix/md5_crypt.c:MD5Name(crypt_md5)()
There is only one minor change, and it does not affect the outcome of the code: the calls to MD5_Update with ctx1 happen before the calls to MD5_Update with ctx (it'll make more sense once you see the code, or I can explain then).
2. There are four more calls to the MD5 library. The first two do this:
    MD5_Init(&ctx2);
    MD5_Update(&ctx2, passwd, strlen(passwd));
    MD5_Final(final2, &ctx2);

    MD5_Init(&ctx3);
    MD5_Update(&ctx3, str, strlen(str));
    MD5_Final(final2, &ctx3);
str is obtained by appending/prepending something to the username. That something is 49 bytes in length.
The last two calls are:
    MD5_Init(&ctx4);
    MD5_Update(&ctx4, str2, strlen(str2));
    MD5_Final(final2, &ctx4);

    MD5_Init(&ctx5);
    MD5_Update(&ctx5, str3, strlen(str3));
    MD5_Final(final2, &ctx5);
str3 is also derived from the username, again with a 49 byte string appended/prepended to it.
str2 seems to be 34 characters in length always.
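To make point 1 a bit more concrete, here is a rough sketch of why feeding ctx1 before ctx should not change the result: the two contexts are independent until ctx1's digest is fed into ctx. Note the assumptions: toy_init/toy_update/toy_final are a made-up stand-in (64-bit FNV-1a, NOT MD5) so that the example needs no crypto library, and the real md5_crypt.c feeds the digest into ctx in a more involved way than the single update shown here.

```c
#include <stdint.h>
#include <string.h>

/* Toy streaming hash (64-bit FNV-1a). This is NOT MD5; it only mimics the
 * init/update/final shape of the MD5 library. */
typedef struct { uint64_t h; } toy_ctx;

static void toy_init(toy_ctx *c) { c->h = 14695981039346656037ULL; }

static void toy_update(toy_ctx *c, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        c->h ^= p[i];
        c->h *= 1099511628211ULL;
    }
}

static uint64_t toy_final(const toy_ctx *c) { return c->h; }

/* ctx is fed first, then ctx1 (the order in pam's md5_crypt.c). */
static uint64_t ctx_first(const char *pw, const char *magic, const char *salt)
{
    toy_ctx ctx, ctx1;
    uint64_t final;

    toy_init(&ctx);
    toy_update(&ctx, pw, strlen(pw));
    toy_update(&ctx, magic, strlen(magic));
    toy_update(&ctx, salt, strlen(salt));

    toy_init(&ctx1);
    toy_update(&ctx1, pw, strlen(pw));
    toy_update(&ctx1, salt, strlen(salt));
    toy_update(&ctx1, pw, strlen(pw));
    final = toy_final(&ctx1);

    /* Simplification: ctx1's digest is fed into ctx in one go. */
    toy_update(&ctx, &final, sizeof final);
    return toy_final(&ctx);
}

/* ctx1 is fed first, then ctx (the order observed in the trace). */
static uint64_t ctx1_first(const char *pw, const char *magic, const char *salt)
{
    toy_ctx ctx, ctx1;
    uint64_t final;

    toy_init(&ctx1);
    toy_update(&ctx1, pw, strlen(pw));
    toy_update(&ctx1, salt, strlen(salt));
    toy_update(&ctx1, pw, strlen(pw));
    final = toy_final(&ctx1);

    toy_init(&ctx);
    toy_update(&ctx, pw, strlen(pw));
    toy_update(&ctx, magic, strlen(magic));
    toy_update(&ctx, salt, strlen(salt));

    toy_update(&ctx, &final, sizeof final);
    return toy_final(&ctx);
}
```

Calling ctx_first() and ctx1_first() with the same password, magic and salt gives the same value, because each context only ever sees its own sequence of bytes.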
This data was obtained through statistical analysis of function calls. We still haven't determined what the actual contents of str, str2, str3 or the salt are. Once we have this, I believe we will have cracked it completely.
If anyone wants to figure it out, you have to somehow get the data that's being passed to MD5_Update.
The first call is the password, the second call is with str. You need to figure out how str is derived.
The next three calls are password, magic ($1$ I think), salt (8 chars). The next three are password, salt, password.
Try and figure out how salt is derived, and if magic is different from $1$, what is it? Is it constant across calls?
Then, look at the last two calls to MD5_Update, calls number 3536 and 3537. No. 3536 seems to be constant, but someone will have to confirm this. No. 3537 is derived from the username. Figure out how.
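For anyone who does manage to hook MD5_Update (with a wrapper library, a debugger, or whatever works for you; the hook itself is not shown here and is left as an assumption), a small helper along these lines might be handy for logging what gets passed in. The name describe_buf and the output format are made up for this sketch. It prints the length, the raw bytes in hex, and a printable view, so that binary data and plain strings can be told apart.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Render `len` bytes of `data` into `out` as
 *     len=N hex=<2 hex digits per byte> text=<printable chars, '.' otherwise>
 * Returns 0 on success, -1 if `out` (of size `outsz`) is too small. */
static int describe_buf(const void *data, size_t len, char *out, size_t outsz)
{
    const unsigned char *p = data;
    size_t used;
    int n = snprintf(out, outsz, "len=%zu hex=", len);

    if (n < 0 || (size_t)n >= outsz)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        if (used + 3 > outsz)            /* 2 hex digits + terminating NUL */
            return -1;
        snprintf(out + used, outsz - used, "%02x", p[i]);
        used += 2;
    }

    if (used + 6 + len + 1 > outsz)      /* " text=" + len chars + NUL */
        return -1;
    memcpy(out + used, " text=", 6);
    used += 6;

    for (size_t i = 0; i < len; i++)
        out[used++] = isprint(p[i]) ? (char)p[i] : '.';
    out[used] = '\0';
    return 0;
}
```

For example, describe_buf("$1$", 3, buf, sizeof buf) fills buf with "len=3 hex=243124 text=$1$". Logging the length separately from the bytes matters because the third parameter need not match what strlen would report.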
Philip
Haven't been following the thread, but gonna jump in nonetheless :)
Q: How do we know that str2, str3 ... really are strings concatenated with other strings? Could they be pointers to structures, or perhaps arbitrary binary memory buffers?
Here, I assume that when you say string, you are referring to an ASCIIZ string, not a binary string - strlen et al.
Regards, -Varun
On Fri, Apr 12, 2002 at 05:06:41PM +0530, Philip S Tellis spoke out thus:
[snip]
-- Spock: We suffered 23 casualties in that attack, Captain.
linux-india-programmers mailing list linux-india-programmers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-india-programmers
On Fri, 12 Apr 2002, Varun Varma wrote:
Q: How do we know that str2, str3 ... really are strings concatenated with other strings? Could they be pointers to structures, or perhaps arbitrary binary memory buffers?
second parameter to MD5_Update is a string, and third parameter is the length of the string.
Here, I assume that when you say string, you are referring to an ASCIIZ string, not a binary string - strlen et al.
yes, although it need not be zero terminated. The string length is passed as third parameter, and need not always be the same as the actual length of the string.
On Fri, Apr 12, 2002 at 07:12:53PM +0530, Philip S Tellis spoke out thus:
On Fri, 12 Apr 2002, Varun Varma wrote:
Q: How do we know that str2, str3 ... really are strings concatenated with other strings? Could they be pointers to structures, or perhaps arbitrary binary memory buffers?
second parameter to MD5_Update is a string, and third parameter is the length of the string.
Isn't it supposed to be void *? Even if it is unsigned char *, the value being passed might be a struct cast to that type.
Question is, do you know for a fact that these are supposed to be strings, or is that based on observing output from ltrace?
Here, I assume that when you say string, you are referring to an ASCIIZ string, not a binary string - strlen et al.
yes, although it need not be zero terminated. The string length is passed as third parameter, and need not always be the same as the actual length of the string.
strlen won't work on anything but null terminated strings. You've shown strlen as the third parameter, and that won't work unless one assumes that these are good old \0 terminated strings. Unless you put in strlen symbolically rather than syntactically, i.e. to show that the length of the data in parameter 2 should be passed as parameter 3.
Regards, -Varun
Sometime on Apr 12, Varun Varma assembled some asciibets to say:
second parameter to MD5_Update is a string, and third parameter is the length of the string.
Isn't it supposed to be void *? Even if it is unsigned char *, the value being passed might be a struct cast to that type.
Question is, do you know for a fact that these are supposed to be strings, or is that based on observing output from ltrace?
It is based on three things:
1. Output of ltrace
2. Knowledge of the algorithm used to generate MD5 passwords by pam
3. The assumption that Yahoo would rather use a tried and tested secure technique than make changes and risk it being insecure.
For the situation of encrypting the username, it could be a struct rather than just a string, but consider this:
struct { char sometext[49]; char username[?]; };
They have to specify some size for username, because they cannot use a char *. If they did, that would result in the address of username being hashed, and not the username itself.
If they do specify a fixed size, then they are letting themselves in for a buffer overflow attack if someone uses a username larger than the array size. Therefore, I doubt it is anything other than a plain string.
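A quick sketch of the two cases, in case the argument is not obvious. Everything here is hypothetical: the struct names, the array size of 32 and the helper functions are mine, chosen only for illustration, and say nothing about what Yahoo actually does. The idea is to look at the raw bytes of each struct (which is what a hash over the struct would see) and check whether the username's characters are in there.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts; 32 is an arbitrary size picked for the sketch. */
struct with_ptr { char sometext[49]; const char *username; };
struct with_arr { char sometext[49]; char username[32]; };

/* Return 1 if the `len` bytes at `buf` contain the bytes of `needle`. */
static int bytes_contain(const void *buf, size_t len, const char *needle)
{
    const unsigned char *p = buf;
    size_t n = strlen(needle);

    if (n == 0 || n > len)
        return n == 0;
    for (size_t i = 0; i + n <= len; i++)
        if (memcmp(p + i, needle, n) == 0)
            return 1;
    return 0;
}

/* Pointer member: the struct holds only the ADDRESS of the name, so hashing
 * the struct's bytes would hash that address, not the name. */
static int ptr_variant_holds_name(const char *name)
{
    struct with_ptr s;

    memset(&s, 0, sizeof s);
    s.username = name;
    return bytes_contain(&s, sizeof s, name);
}

/* Array member: the characters live inside the struct, but only as many as
 * fit. The copy is bounded here; an unbounded copy would overflow. */
static int arr_variant_holds_name(const char *name)
{
    struct with_arr s;

    memset(&s, 0, sizeof s);
    strncpy(s.username, name, sizeof s.username - 1);
    return bytes_contain(&s, sizeof s, name);
}
```

With a name longer than a pointer, such as "some_yahoo_user", ptr_variant_holds_name() returns 0 and arr_variant_holds_name() returns 1. With a name longer than 31 characters the array variant also returns 0, because the bounded copy truncates it.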
On a related note, I had succeeded in crashing other people's yahoo clients (official clients as well as alternative clients) using an implementation of conferencing in everybuddy. I cannot reproduce this now as I can no longer log in. That, and the people I tested with won't cooperate any more :)
yes, although it need not be zero terminated. The string length is passed as third parameter, and need not always be the same as the actual length of the string.
strlen won't work on anything but null terminated strings. You've shown strlen as the third parameter, and that won't work unless
no, that was symbolic. The strlen function needn't be used there, it may be an absolute integer (in some cases it is).
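To illustrate why the length has to be tracked separately in some cases, here is a small sketch. The helper name strlen_would_report and the STRLEN_UNUSABLE marker are made up for this example. It compares the true length of a buffer with what strlen would see: a binary buffer such as an MD5 digest can contain a zero byte in the middle, and an 8-character salt stored without a terminator has no zero byte at all.

```c
#include <stddef.h>
#include <string.h>

/* Marker value: strlen() cannot be used on this buffer at all. */
#define STRLEN_UNUSABLE ((size_t)-1)

/* Given a buffer whose real size is `true_len`, return the length strlen()
 * would report for it, or STRLEN_UNUSABLE if there is no NUL byte inside the
 * buffer (strlen() would then read past the end, which is undefined). */
static size_t strlen_would_report(const void *buf, size_t true_len)
{
    const unsigned char *start = buf;
    const unsigned char *nul = memchr(buf, '\0', true_len);

    if (nul == NULL)
        return STRLEN_UNUSABLE;
    return (size_t)(nul - start);
}
```

For the 4-byte buffer {0x12, 0x00, 0x34, 0x56} this returns 1, so passing strlen as the third parameter would hash one byte instead of four. For an ordinary string like "abc" (4 bytes including the terminator) it returns 3, which is the case where strlen and the real length agree.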