Hi, this is quite a strange question, but I need your help.
I have an executable file written many years ago, probably in Fortran, which I need to decompile since the source code is impossible to get.
I have tried with IDA Pro 5.5 and I have been able to view the assembly code, but I haven't been able to generate the C code. I get an error saying that this software doesn't work with 16-bit functions. Therefore, I guess that the program was compiled as 16-bit, even though I am running it on a 32-bit Windows 7 machine.
Do you know any decompilers for 16-bit code? Do I have to use a Fortran / Cobol decompiler, or can I translate it to another language like C?
6 replies to this topic
#1
Posted 31 March 2011 - 07:30 AM
#2
Posted 31 March 2011 - 09:25 AM
I am unaware of any 16-bit disassemblers readily available on the internet. I must also warn you that what you get out of the disassembled code cannot be translated into reusable C; it may just be a pile of inline assembly and jumps.
Be sure to read the updated FAQ! || Health is achieved through the same 10,000 steps.
If a suggested code/method fails, informing us is less important than telling us why or what errors occurred.
#3
Posted 31 March 2011 - 03:55 PM
Thanks for your help, but it doesn't really help...
Why do you say that disassembled code cannot be translated into reusable C? It could be C or any other high-level language. Don't decompilers work properly?
I would like to try anyway with a 16 bit one.
Thanks!
#4
Posted 31 March 2011 - 05:56 PM
Most decompilers I've seen only go to Assembly.
#5
Posted 31 March 2011 - 06:05 PM
To put you on the right track: I would assume your application was compiled for an MS-DOS environment (i.e. before Windows 95), so you would need to search for an MS-DOS 16-bit disassembler.
However, you still have the problem I mentioned: there is no valid way to produce readable C code out of this assembly. As I said, most converters will generate a C program made up mainly of inline assembly and VERY low-level operations, and nothing in it will be usable unless you fundamentally understand what it does. I am not even aware of any tool that does this other than IDA Pro.
#6
Posted 01 April 2011 - 03:37 AM
Thank you very much for your contribution. I agree with your view that it was compiled in a 16-bit MS-DOS environment. However, I know something about computer science but obviously not enough... MS-DOS is an operating system, as far as I know. Therefore, why do you say that I need an MS-DOS 16-bit disassembler? I mean... why is the OS important, rather than the original language in which it was written?
Another question... Would you assume that it was written in Fortran or Cobol if it was written in the early 90s or perhaps in the late 80s?
And finally, just to make myself clear... I didn't mean to say that I need the ORIGINAL source code. That would be great, but I know that it would also be magic. I want HIGH-LEVEL CODE that is simpler to understand than the machine language. Of course I would have to work hard to get an idea of how the original software worked from a decompilation (which is my intention, and I do have the time and the need to do that). But I don't want to start with the assembly or pseudo-assembly code. To give you an idea... it would take 4,000 pages to print it all. It's obvious that I need a first simplification done by software :-)
#7
Posted 02 April 2011 - 04:23 AM
From what I could gather, you have this program compiled for a Windows environment. Windows 95 supports 32-bit addressing, and your application likely came from that era, or rather from before it, when only 16-bit addressing was available. Before Windows 95, Windows was just a shell on top of DOS, which is why I assume you need an MS-DOS-related binary disassembly tool; otherwise the program would likely have been compiled as 32-bit. (Even if that is not the case, I am not aware of any 16-bit disassemblers that run on 32-bit systems!)
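The DOS-versus-Windows guess above can be checked from the file header instead of being assumed. A minimal sketch in C, under these assumptions: the file starts with the usual "MZ" signature, the 4-byte little-endian field at offset 0x3C points to a newer header if there is one, and that header starts with "NE" (16-bit Windows) or "PE\0\0" (32-bit Windows). The names exe_kind and classify_exe are made up for this example, and other formats such as LE/LX are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of classify_exe(). These names are hypothetical, chosen for this sketch. */
typedef enum {
    EXE_UNKNOWN,   /* no "MZ" signature at the start of the buffer */
    EXE_DOS_MZ,    /* "MZ" only: treated as a plain 16-bit MS-DOS executable */
    EXE_WIN16_NE,  /* "NE" header found: 16-bit Windows "New Executable" */
    EXE_WIN32_PE   /* "PE\0\0" header found: 32-bit (or later) Windows */
} exe_kind;

/* Read a 32-bit little-endian value from four bytes. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Classify the first len bytes of an executable file held in buf. */
exe_kind classify_exe(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    if (len < 2 || buf[0] != 'M' || buf[1] != 'Z')
        return EXE_UNKNOWN;
    if (len < 0x40)
        return EXE_DOS_MZ;              /* too short to hold the 0x3C field */

    /* Offset of the newer header, if any. In plain DOS files this field
       may hold anything, so it is bounds-checked before use. */
    uint32_t off = read_le32(buf + 0x3C);
    if (off >= len || len - off < 2)
        return EXE_DOS_MZ;

    if (buf[off] == 'N' && buf[off + 1] == 'E')
        return EXE_WIN16_NE;
    if (len - off >= 4 && buf[off] == 'P' && buf[off + 1] == 'E' &&
        buf[off + 2] == 0 && buf[off + 3] == 0)
        return EXE_WIN32_PE;

    return EXE_DOS_MZ;                  /* anything else: treated as plain MZ here */
}
```

To use it, read the start of the file with fread (enough bytes to include the header that offset 0x3C points to) and pass the buffer and the byte count. EXE_DOS_MZ or EXE_WIN16_NE would support the 16-bit guess; EXE_WIN32_PE would contradict it.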
I hope you have been able to find a disassembler, but to justify my point entirely, look at this silly example: a pseudo-reproduction of a 90's program that I just decompiled with Hex-Rays (an IDA Pro Advanced add-on, which you can think of as a more expensive tool for reconstructing code from assembly). Note that this is a 500 KB executable and I am showing you a small portion of the 2 MB of source it has produced:
//----- (0101FCE0) --------------------------------------------------------
int __fastcall sub_101FCE0(int a1)
{
return *(_DWORD *)(a1 + 76);
}
//----- (0101FCE9) --------------------------------------------------------
void __fastcall sub_101FCE9(int a1)
{
int v1; // esi@1
v1 = a1;
*(_DWORD *)(a1 + 76) = 0;
*(_DWORD *)a1 = &off_1001CC4;
sub_103A31D(a1 + 8);
sub_100E612(v1);
}
//----- (0101FD0D) --------------------------------------------------------
char __thiscall sub_101FD0D(int this, int a2, int a3, int a4, int a5, unsigned int a6)
{
int v6; // ebx@1
int v7; // edi@1
int v8; // esi@1
unsigned int v10; // eax@1
unsigned int v11; // ST10_4@1
int v12; // ST0C_4@1
int v13; // ST00_4@1
int v14; // esi@3
unsigned int v15; // esi@3
v10 = a6;
v6 = a3;
v8 = a4;
v11 = a6;
v12 = a5;
v7 = this;
*(_DWORD *)(this + 76) = 0;
v13 = a2;
*(_DWORD *)(this + 80) = v10;
if ( !sub_103A368(this + 8, v13, v6, v8, v12, v11, 0) )
{
sub_103C381();
LABEL_4:
*(_DWORD *)(v7 + 76) = 0;
return 0;
}
a6 = *(_DWORD *)(v7 + 24);
v14 = (unsigned __int16)(*(int (__thiscall **)(int))(*(_DWORD *)v8 + 52))(v8);
v15 = a6 * (*(int (__thiscall **)(int))(*(_DWORD *)v6 + 8))(v6) * v14 >> 5;
*(_DWORD *)(v7 + 76) = v15;
if ( !v15 )
goto LABEL_4;
return 1;
}
//----- (0101FD81) --------------------------------------------------------
int __thiscall sub_101FD81(int this, char a2)
{
int v2; // esi@1
v2 = this;
sub_101FCE9(this);
if ( a2 & 1 )
sub_1013F78(v2);
return v2;
}
//----- (0101FDA7) --------------------------------------------------------
char __stdcall sub_101FDA7(char a1, char a2)
{
char result; // al@1
byte_1062040 = a1;
result = a2;
byte_1062041 = a2;
return result;
}
//----- (0101FDC5) --------------------------------------------------------
int __stdcall sub_101FDC5(int a1, int a2)
{
int result; // eax@1
*(_BYTE *)a1 = byte_1062040;
result = a2;
*(_BYTE *)a2 = byte_1062041;
return result;
}
//----- (0101FDE9) --------------------------------------------------------
__int16 __userpurge sub_101FDE9<ax>(int a1<esi>, int a2, unsigned int a3)
{
int v3; // eax@1
int v4; // ecx@1
int v5; // edx@5
int v6; // edx@9
int v7; // ecx@9
int v8; // esi@9
v4 = a2;
LOWORD(v3) = *(_WORD *)(a2 + 4);
LOWORD(a1) = *(_WORD *)(a2 + 6);
if ( !(a3 & 0x1FF) )
{
if ( !(v3 & 1) )
{
v3 = (unsigned __int16)v3;
if ( (unsigned __int16)v3 + 2 * (unsigned int)(unsigned __int16)a1 <= a3 )
{
if ( (a3 >> 9) + 1 == (unsigned __int16)a1 )
{
v3 += a2;
v5 = *(_WORD *)v3;
do
{
do
++v5;
while ( !(_WORD)v5 );
}
while ( (_WORD)v5 == -1 );
*(_WORD *)v3 = v5;
if ( (_WORD)a1 > 1u )
{
v7 = v4 + 510;
v6 = v3 + 2;
v8 = (unsigned __int16)(a1 - 1);
do
{
*(_WORD *)v6 = *(_WORD *)v7;
*(_WORD *)v7 = *(_WORD *)v3;
v7 += 512;
v6 += 2;
--v8;
}
while ( v8 );
}
}
}
}
}
return v3;
}
//----- (0101FE6C) --------------------------------------------------------
char __stdcall sub_101FE6C(int a1)
{
unsigned int v1; // edx@1
unsigned int v2; // ecx@2
signed int v3; // eax@3
char v4; // zf@9
char result; // al@16
v1 = *(_BYTE *)(a1 + 64);
if ( v1 )
{
v2 = 0;
if ( v1 )
{
while ( 1 )
{
LOWORD(v3) = *(_WORD *)(a1 + 2 * v2 + 66);
if ( (_WORD)v3 < 0x20u )
goto LABEL_18;
v3 = (unsigned __int16)v3;
if ( (signed int)(unsigned __int16)v3 <= 60 )
break;
if ( v3 >= 62 )
{
if ( v3 <= 63 || v3 == 92 )
goto LABEL_18;
v4 = v3 == 124;
goto LABEL_14;
}
LABEL_15:
++v2;
if ( v2 >= v1 )
goto LABEL_16;
}
if ( (unsigned __int16)v3 == 60 || v3 == 34 || v3 == 42 || v3 == 47 )
goto LABEL_18;
v4 = v3 == 58;
LABEL_14:
if ( v4 )
goto LABEL_18;
goto LABEL_15;
}
LABEL_16:
result = 1;
}
else
{
LABEL_18:
result = 0;
}
return result;
}
//----- (0101FED5) --------------------------------------------------------
char __stdcall sub_101FED5(int a1)
{
int v1; // eax@1
unsigned int v2; // ebx@1
int v3; // edi@1
signed int v4; // esi@1
unsigned int v5; // edx@2
signed int v6; // ecx@3
v3 = a1;
v4 = *(_BYTE *)(a1 + 64);
v1 = 0;
v2 = *(_BYTE *)(a1 + 64);
if ( (unsigned int)v4 <= 0xC )
{
v5 = 0;
BYTE3(a1) = 0;
if ( v4 )
{
do
{
LOWORD(v6) = *(_WORD *)(v3 + 2 * v5 + 66);
if ( (_WORD)v6 < 0x20u )
return 0;
v6 = (unsigned __int16)v6;
if ( (signed int)(unsigned __int16)v6 > 47 )
{
if ( v6 >= 58 && (v6 <= 63 || v6 > 90 && (v6 <= 93 || v6 == 124)) )
return 0;
}
else
{
if ( (unsigned __int16)v6 == 47 || v6 == 34 )
return 0;
if ( v6 > 41 )
{
if ( v6 <= 44 )
return 0;
if ( v6 == 46 )
{
if ( BYTE3(a1) )
return 0;
BYTE3(a1) = 1;
v2 = v5;
v1 = v4 - v5 - 1;
}
}
}
++v5;
}
while ( v5 < v4 );
}
if ( v2 )
{
if ( v2 <= 8 && *(_WORD *)(v3 + 2 * v2 + 64) != 32 )
{
if ( !v1 )
return BYTE3(a1) == (_BYTE)v1;
if ( (unsigned int)v1 <= 3 && *(_WORD *)(v3 + 2 * (v2 + v1) + 66) != 32 )
return 1;
}
}
else
{
if ( BYTE3(a1) != (_BYTE)v2 && v4 == 1 )
return 1;
}
}
return 0;
}
//----- (0101FFA0) --------------------------------------------------------
char __stdcall sub_101FFA0(int a1)
{
int v1; // ebx@1
unsigned int v2; // esi@1
char result; // al@2
unsigned int v4; // edi@3
v1 = a1;
v2 = *(_DWORD *)(a1 + 12);
if ( v2 <= 0x20 )
{
v4 = 0;
if ( v2 )
{
while ( (unsigned __int16)sub_1011E5D(v1, v4) >= 0x20u )
{
++v4;
if ( v4 >= v2 )
goto LABEL_6;
}
result = 0;
}
else
{
LABEL_6:
result = 1;
}
}
else
{
result = 0;
}
return result;
}
//----- (0101FFE2) --------------------------------------------------------
char __fastcall sub_101FFE2(int a1)
{
return *(_BYTE *)(a1 + 74);
}
//----- (0101FFEB) --------------------------------------------------------
char __fastcall sub_101FFEB(int a1)
{
int v1; // eax@1
v1 = *(_DWORD *)(a1 + 8);
return !*(_DWORD *)(v1 + 32) && !*(_WORD *)(v1 + 36) && !*(_WORD *)(v1 + 38);
}
//----- (0102000C) --------------------------------------------------------
int __fastcall sub_102000C(int a1)
{
int result; // eax@2
if ( *(_BYTE *)(*(_DWORD *)(a1 + 8) + 8) == 1 )
result = 0;
else
result = *(_DWORD *)(a1 + 8) + *(_WORD *)(*(_DWORD *)(a1 + 8) + 20);
return result;
}
//----- (01020027) --------------------------------------------------------
int __fastcall sub_1020027(int a1)
{
int result; // eax@1
*(_BYTE *)(a1 + 137) = 0;
*(_DWORD *)(a1 + 68) = 0;
result = 0;
memset((void *)(a1 + 72), 0, 0x2Cu);
*(_DWORD *)(a1 + 120) = 0;
*(_DWORD *)(a1 + 124) = 0;
*(_DWORD *)(a1 + 128) = 0;
*(_DWORD *)(a1 + 132) = 0;
*(_BYTE *)(a1 + 136) = 0;
*(_DWORD *)(a1 + 144) = 0;
*(_DWORD *)(a1 + 148) = 0;
*(_DWORD *)(a1 + 152) = 0;
*(_DWORD *)(a1 + 156) = 0;
return result;
}
//----- (01020078) --------------------------------------------------------
int __fastcall sub_1020078(int a1)
{
int result; // eax@1
*(_BYTE *)(a1 + 137) = 0;
*(_DWORD *)(a1 + 68) = 0;
result = 0;
memset((void *)(a1 + 72), 0, 0x2Cu);
*(_DWORD *)(a1 + 120) = 0;
*(_DWORD *)(a1 + 124) = 0;
*(_DWORD *)(a1 + 128) = 0;
*(_DWORD *)(a1 + 132) = 0;
*(_DWORD *)(a1 + 144) = 0;
*(_DWORD *)(a1 + 148) = 0;
*(_DWORD *)(a1 + 152) = 0;
*(_DWORD *)(a1 + 156) = 0;
return result;
}
//----- (010200C3) --------------------------------------------------------
void __fastcall sub_10200C3(int a1)
{
int v1; // esi@1
v1 = a1;
*(_DWORD *)a1 = &off_1001CDC;
sub_1020078(a1);
sub_1011010(v1 + 48);
sub_10193F7(v1);
}
//----- (010200E8) --------------------------------------------------------
int __fastcall sub_10200E8(int a1)
{
*(_WORD *)(*(_DWORD *)(a1 + 68) + 11) = *(_WORD *)(a1 + 72);
*(_BYTE *)(*(_DWORD *)(a1 + 68) + 13) = *(_BYTE *)(a1 + 74);
*(_WORD *)(*(_DWORD *)(a1 + 68) + 14) = *(_WORD *)(a1 + 76);
*(_BYTE *)(*(_DWORD *)(a1 + 68) + 16) = *(_BYTE *)(a1 + 78);
*(_WORD *)(*(_DWORD *)(a1 + 68) + 17) = *(_WORD *)(a1 + 80);
*(_WORD *)(*(_DWORD *)(a1 + 68) + 19) = *(_WORD *)(a1 + 82);
*(_BYTE *)(*(_DWORD *)(a1 + 68) + 21) = *(_BYTE *)(a1 + 84);
*(_WORD *)(*(_DWORD *)(a1 + 68) + 22) = *(_WORD *)(a1 + 86);
*(_WORD *)(*(_DWORD *)(a1 + 68) + 24) = *(_WORD *)(a1 + 88);
*(_WORD *)(*(_DWORD *)(a1 + 68) + 26) = *(_WORD *)(a1 + 90);
*(_DWORD *)(*(_DWORD *)(a1 + 68) + 28) = *(_DWORD *)(a1 + 92);
*(_DWORD *)(*(_DWORD *)(a1 + 68) + 32) = *(_DWORD *)(a1 + 96);
return *(_DWORD *)(a1 + 8);
}
The original source had about 20 pages of definitions and macros; this new source has roughly 200 pages of them, all beautifully named with mnemonics, which I assume is worse than the assembly.
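To give an idea of the manual work involved in reading such output, here is a hand-cleaned sketch of sub_101FE6C from the listing above. It is one reading of the listing, not recovered source. It assumes that the byte at offset 64 is a character count and that the 16-bit values starting at offset 66 are the characters. The function name, the parameter names, and the guess that the routine validates a file name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-written reading of sub_101FE6C (the name is invented for this sketch).
   Returns 1 when the name is non-empty and every 16-bit character is
   at least 0x20 and is not one of  " * / : < > ? \ |
   Returns 0 otherwise, including for an empty name. */
int name_has_only_legal_chars(const uint16_t *chars, size_t count)
{
    assert(chars != NULL || count == 0);

    if (count == 0)
        return 0;                       /* the listing returns 0 for length 0 */

    for (size_t i = 0; i < count; ++i) {
        uint16_t c = chars[i];

        if (c < 0x20)
            return 0;                   /* control characters are rejected */

        switch (c) {
        case '"':  case '*': case '/': case ':': case '<':
        case '>':  case '?': case '\\': case '|':
            return 0;                   /* codes 34, 42, 47, 58, 60, 62, 63, 92, 124 */
        default:
            break;
        }
    }
    return 1;
}
```

The gotos and the v4 flag in the decompiled version collapse into one loop and one switch once the numeric constants are read as character codes. Every function in the listing needs that kind of interpretation before it becomes readable.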