Can anyone help me understand this asm code (It's short)

9 replies to this topic

#1
Renato_Motta

    Newbie

  • Members
  • 4 posts
I'm trying to learn shellcode for a computer science project, but I'm having a bit of trouble writing it.
I'm reading a book called The Shellcoder's Handbook, and it gives me code that won't work properly.
This is the code:

section .text

    global _start

_start:
    jmp short   GotoCall

shellcode:
    pop         rsi
    xor         eax, eax
    mov byte    [esi + 7], al
    lea         ebx, [esi]
    mov long    [esi + 8], ebx
    mov long    [esi + 12], eax
    mov byte    al, 0x0b
    mov         ebx, esi
    lea         ecx, [esi + 8]
    lea         edx, [esi + 12]
    int         0x80

GotoCall:
    call        shellcode
    db          '/bin/shJAAAAKKKK'

Simply put, this is supposed to spawn a shell, but it won't work, and when I debug it with gdb I get some weird code.
This is the gdb output:
gdb ./sclivro

Dump of assembler code for function _start:
0x0000000000400080 <_start+0>:	jmp    0x4000a2 <_start+34>
0x0000000000400082 <_start+2>:	pop    %rsi
0x0000000000400083 <_start+3>:	xor    %eax,%eax
0x0000000000400085 <_start+5>:	addr32 mov %al,0x7(%esi)
0x0000000000400089 <_start+9>:	addr32 lea (%esi),%ebx
0x000000000040008c <_start+12>:	addr32 mov %ebx,0x8(%esi)
0x0000000000400090 <_start+16>:	addr32 mov %eax,0xc(%esi)
0x0000000000400094 <_start+20>:	mov    $0xb,%al
0x0000000000400096 <_start+22>:	mov    %esi,%ebx
0x0000000000400098 <_start+24>:	addr32 lea 0x8(%esi),%ecx
0x000000000040009c <_start+28>:	addr32 lea 0xc(%esi),%edx
0x00000000004000a0 <_start+32>:	int    $0x80
0x00000000004000a2 <_start+34>:	callq  0x400082 <_start+2>
### What is all this???####
0x00000000004000a7 <_start+39>:	(bad)
0x00000000004000a8 <_start+40>:	(bad)
0x00000000004000a9 <_start+41>:	imul   $0x414a6873,0x2f(%rsi),%ebp
0x00000000004000b0 <_start+48>:	rex.B
0x00000000004000b1 <_start+49>:	rex.B
0x00000000004000b2 <_start+50>:	rex.B
0x00000000004000b3 <_start+51>:	rex.WXB
0x00000000004000b4 <_start+52>:	rex.WXB
0x00000000004000b5 <_start+53>:	rex.WXB
0x00000000004000b6 <_start+54>:	rex.WXB
End of assembler dump.

I build the code with yasm and ld:
yasm -f elf64 sclivro.asm
ld -o sclivro sclivro.o

When I run it, I get a segmentation fault.

My OS is Debian 6.0 x64.

I have an Intel Celeron processor.

My question is: what is all that code below my comment in the dump, from <_start+39> (0x4000a7) to <_start+54> (0x4000b6), with the (bad), imul and rex.* lines?

Why is that code there?
And what am I doing wrong?
Thanks for your time.

#2
RhetoricalRuvim

    JavaScript Programmer

  • Members
  • 1,251 posts
  • Location:C:\Countries\US

Quote

section .text

    global _start

_start:
    jmp short   GotoCall

shellcode:
    pop         rsi
    xor         eax, eax
    mov byte    [esi + 7], al
    lea         ebx, [esi]
    mov long    [esi + 8], ebx
    mov long    [esi + 12], eax
    mov byte    al, 0x0b
    mov         ebx, esi
    lea         ecx, [esi + 8]
    lea         edx, [esi + 12]
    int         0x80

GotoCall:
    call        shellcode
    db          '/bin/shJAAAAKKKK'

At the label shellcode, what are you trying to do with "pop rsi"? Are you trying to get the address that follows the call instruction at the label GotoCall (that's how it appears)? After that call instruction, aren't you supposed to do something like return or exit? And right before the GotoCall label, aren't you supposed to either exit the program or return from the function that the call at GotoCall entered?

Edit: Oh, and another thing: shouldn't the "db '/bin/shJAAAAKKKK'" be in the .data section? Such as:

section .data
db '/bin/shJAAAAKKKK'

And also, don't you want to label it? Like this:

section .data
some_label_for_some_string_001        db '/bin/shJAAAAKKKK'

Edit:

Quote

### What is all this???####
0x00000000004000a7 <_start+39>:	(bad)
0x00000000004000a8 <_start+40>:	(bad)
0x00000000004000a9 <_start+41>:	imul   $0x414a6873,0x2f(%rsi),%ebp
0x00000000004000b0 <_start+48>:	rex.B
0x00000000004000b1 <_start+49>:	rex.B
0x00000000004000b2 <_start+50>:	rex.B
0x00000000004000b3 <_start+51>:	rex.WXB
0x00000000004000b4 <_start+52>:	rex.WXB
0x00000000004000b5 <_start+53>:	rex.WXB
0x00000000004000b6 <_start+54>:	rex.WXB
End of assembler dump.

That's probably the disassembler's attempt to decode the bytes of this string as if they were machine code:
/bin/shJAAAAKKKK
It isn't machine code, but neither the disassembler nor the processor can tell that unless you keep execution out of that area. If the processor ever gets there, it will try to execute whatever bytes it finds, whether or not they make sense.
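
As a rough cross-check of that reading, here is a small Python sketch that lines the bytes of the string up with the addresses in the dump. It assumes the string starts at 0x4000a7, the address right after the 5-byte callq at 0x4000a2; the variable names are made up for illustration.

```python
# Bytes of the string that follows the call instruction.
data = b"/bin/shJAAAAKKKK"
base = 0x4000a7  # assumed start address: 0x4000a2 (callq) + 5 bytes

# Print each byte next to the address gdb would show for it.
for offset, value in enumerate(data):
    print(f"{base + offset:#x}: {value:#04x} {chr(value)!r}")

# The imul in the dump starts at the 'i' (offset 2). Its 32-bit immediate
# is the four bytes "shJA" at offsets 5..8, read as little-endian.
imul_immediate = int.from_bytes(data[5:9], "little")
print(hex(imul_immediate))
```

The immediate comes out as 0x414a6873, which is the constant in the imul line, and the leftover 'A' (0x41) and 'K' (0x4b) bytes are the ones gdb prints as rex.B and rex.WXB.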

Edited by RhetoricalRuvim, 13 February 2011 - 03:48 PM.


#3
Renato_Motta

Yes, with "pop rsi" I'm trying to get the address of
db '/bin/shJAAAAKKKK'
The call instruction pushes the address of the first byte of our string (/bin/sh) onto the stack as its return address and jumps to shellcode; then pop rsi puts that address into rsi.

I don't put "db '/bin/shJAAAAKKKK'" in the .data section, and I don't label it, because I want to use relative addressing.

The book tells me to follow these steps

1. Fill EAX with nulls by xoring EAX with itself.
2. Terminate our /bin/sh string by copying AL over the last byte of the
string.
3. Get the address of the beginning of the string, which is stored in RSI,
and copy that value into EBX.
4. Copy the value stored in EBX, now the address of the beginning of the
string, over the AAAA placeholders. This is the argument pointer to the
binary to be executed, which is required by execve. Again, you need to
calculate the offset.
5. Copy the nulls still stored in EAX over the KKKK placeholders, using the
correct offset.
6. EAX no longer needs to be filled with nulls, so copy the value of our
execve syscall (0x0b) into AL.
7. Load EBX with the address of our string.
8. Load the address of the value stored in the AAAA placeholder, which is a
pointer to our string, into ECX.
9. Load up EDX with the address of the value in KKKK, a pointer to null.
10. Execute int 0x80.
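
To make the offsets in those steps concrete, here is a minimal Python sketch that applies steps 2, 4 and 5 to a copy of the string and lists the register values that steps 6 to 9 would produce. It assumes 4-byte pointers, as in the book's 32-bit code, and takes 0x4000a7 from the gdb dump in my first post as the string address; patch_payload is a made-up helper name.

```python
import struct

def patch_payload(payload: bytes, string_address: int) -> bytes:
    """Simulate the three memory writes the shellcode performs on its own string."""
    buf = bytearray(payload)
    buf[7] = 0                                      # step 2: 'J' becomes the NUL terminator
    struct.pack_into("<I", buf, 8, string_address)  # step 4: AAAA becomes a pointer to the string
    struct.pack_into("<I", buf, 12, 0)              # step 5: KKKK becomes a NULL pointer
    return bytes(buf)

STRING_ADDRESS = 0x4000a7  # assumption: taken from the gdb dump
patched = patch_payload(b"/bin/shJAAAAKKKK", STRING_ADDRESS)

# Register values at the time of int 0x80 (steps 6 to 9).
registers = {
    "eax": 0x0b,                 # execve system call number
    "ebx": STRING_ADDRESS,       # pointer to "/bin/sh"
    "ecx": STRING_ADDRESS + 8,   # pointer to the argument array
    "edx": STRING_ADDRESS + 12,  # pointer to the NULL environment entry
}

print(patched)
print({name: hex(value) for name, value in registers.items()})
```

So the three writes land at offsets 7, 8 and 12 inside the string itself, which only works if the memory holding the string is writable.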

#4
RhetoricalRuvim

So why not use system call 1 (exit) right after the int 0x80? As far as I understand, int 0x80 is how you make a Linux system call. So right before the GotoCall label, try inserting something like this:
xor ebx, ebx    ;; clear ebx (exit status 0)
mov eax, 1    ;; Linux exit system call number
int 0x80
And I am not totally sure about this: you can probably read data from the code segment, but you might get a fault if you try to modify it, so I have doubts about keeping strings that you write to in the code segment. You can try it this way, and if it doesn't work, try the other way. I also think there's probably some YASM keyword for relative addressing.
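
For reference, here is a small Python sketch of the register conventions involved. The numbers are the Linux ones as far as I know them (exit = 1 and execve = 11 for 32-bit int 0x80; exit = 60 and execve = 59 for the 64-bit syscall instruction); the table layout and the describe helper are just for illustration.

```python
# Linux system call conventions for the two entry mechanisms.
ABI = {
    "int 0x80 (i386)": {
        "number_reg": "eax",
        "arg_regs": ["ebx", "ecx", "edx"],
        "syscalls": {"exit": 1, "execve": 11},
    },
    "syscall (x86-64)": {
        "number_reg": "rax",
        "arg_regs": ["rdi", "rsi", "rdx"],
        "syscalls": {"exit": 60, "execve": 59},
    },
}

def describe(abi_name, syscall_name):
    """Return a one-line summary of how to set up the given system call."""
    abi = ABI[abi_name]
    number = abi["syscalls"][syscall_name]
    return f"{abi['number_reg']} = {number}, arguments in {', '.join(abi['arg_regs'])}"

print(describe("int 0x80 (i386)", "exit"))
print(describe("syscall (x86-64)", "execve"))
```

The exit snippet above follows the first row: the number goes in eax and the exit status in ebx.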

#5
Renato_Motta

But that does not explain why I'm getting a segmentation fault =/
When I debug the program, gdb reports the fault at mov %al,0x7(%esi):

Program received signal SIGSEGV, Segmentation fault.
0x0000000000400085 in _start ()
(gdb) disas _start
Dump of assembler code for function _start:
0x0000000000400080 <_start+0>:	jmp    0x4000a2 <_start+34>
0x0000000000400082 <_start+2>:	pop    %rsi
0x0000000000400083 <_start+3>:	xor    %eax,%eax
# This line
->0x0000000000400085 <_start+5>:	addr32 mov %al,0x7(%esi)
0x0000000000400089 <_start+9>:	addr32 lea (%esi),%ebx
0x000000000040008c <_start+12>:	addr32 mov %ebx,0x8(%esi)
0x0000000000400090 <_start+16>:	addr32 mov %eax,0xc(%esi)
0x0000000000400094 <_start+20>:	mov    $0xb,%al
0x0000000000400096 <_start+22>:	mov    %esi,%ebx
0x0000000000400098 <_start+24>:	addr32 lea 0x8(%esi),%ecx
0x000000000040009c <_start+28>:	addr32 lea 0xc(%esi),%edx
0x00000000004000a0 <_start+32>:	int    $0x80
0x00000000004000a2 <_start+34>:	callq  0x400082 <_start+2>
0x00000000004000a7 <_start+39>:	(bad)
0x00000000004000a8 <_start+40>:	(bad)
0x00000000004000a9 <_start+41>:	imul   $0x414a6873,0x2f(%rsi),%ebp
0x00000000004000b0 <_start+48>:	rex.B
0x00000000004000b1 <_start+49>:	rex.B
0x00000000004000b2 <_start+50>:	rex.B
0x00000000004000b3 <_start+51>:	rex.WXB
0x00000000004000b4 <_start+52>:	rex.WXB
0x00000000004000b5 <_start+53>:	rex.WXB
0x00000000004000b6 <_start+54>:	rex.WXB add    %bpl,(%r14)
End of assembler dump.



#6
RhetoricalRuvim

So I still don't really get what you're doing there. If you use "pop rsi", doesn't that mean you're in 64-bit mode? But then you use 32-bit registers for addressing? I don't know if that's the problem; it might instead have something to do with the string being in the code segment.
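
One way to check the 32-bit addressing idea is to work out which address the instruction actually uses. With the addr32 prefix, only the low 32 bits of the base register take part in the address calculation. A minimal Python sketch, assuming rsi holds 0x4000a7 (the address after the callq in the dump); the stack-like address is made up for contrast:

```python
MASK32 = 0xFFFFFFFF

def addr32_effective_address(base_register, displacement):
    """Address used by an addr32 memory operand: 32-bit arithmetic only."""
    return ((base_register & MASK32) + displacement) & MASK32

text_address = 0x4000a7              # from the dump: return address of the callq
stack_like_address = 0x7ffd12345678  # hypothetical 64-bit stack address

print(hex(addr32_effective_address(text_address, 7)))
print(hex(addr32_effective_address(stack_like_address, 7)))
```

For 0x4000a7 nothing is lost, so the faulting write still targets the string at 0x4000ae. Truncation would only matter for addresses above 4 GB, such as the made-up stack address, so the 32-bit registers alone don't seem to explain this particular fault.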

#7
RhetoricalRuvim

But that doesn't make too much sense, though; it worked when I tried something like this (this is MASM32):
.386
.model flat, stdcall
option casemap:none

include \RS\include\ifiles.inc

.data
.data?

.code
start:
    jmp some_string

print_some_string:
    pop eax ;; address of some_string
    push eax
    call StdOut

    push eax
    call ExitProcess

some_string:
    call print_some_string
    db "Hello, how are you? ", 0

end start

And, by the way, "\RS\include\ifiles.inc" is a file I made myself; it contains nothing more than a few "include" statements that I don't want to type every time I write a program, plus a few macros of my own.

#8
RhetoricalRuvim

Oh, I think I got it. You can't modify things that are in the code segment. If you want to write a '\0' byte after the string, the string has to be in a data segment. I tried the following, and it generated an error:
.386
.model flat, stdcall
option casemap:none

include \RS\include\ifiles.inc

.data
.data?

.code
start:
    jmp some_string

print_some_string:
    pop eax ;; address of some_string
    mov ebx, eax
    mov byte ptr [ebx+20], 0 ;; write into the code segment
    push eax
    call StdOut

    push eax
    call ExitProcess

some_string:
    call print_some_string
    db "Hello, how are you? ", 0

end start
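
On the Linux side of this thread, the same idea shows up as segment permissions in the ELF file. Here is a small Python sketch that decodes the p_flags bits of a program header (PF_X = 1, PF_W = 2, PF_R = 4 in the ELF format). The two sample values are assumptions about a typical layout, not read from the sclivro binary; readelf -l prints the real ones.

```python
# Permission bits of an ELF program header (p_flags field).
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

def describe_flags(p_flags):
    """Render p_flags in the familiar R/W/X order, with '-' for a missing bit."""
    pairs = (("R", PF_R), ("W", PF_W), ("X", PF_X))
    return "".join(letter if p_flags & bit else "-" for letter, bit in pairs)

def is_writable(p_flags):
    """True if stores into the segment are permitted."""
    return bool(p_flags & PF_W)

typical_text_segment = PF_R | PF_X  # assumed flags for the segment holding .text
typical_data_segment = PF_R | PF_W  # assumed flags for the segment holding .data

for name, flags in (("text", typical_text_segment), ("data", typical_data_segment)):
    print(name, describe_flags(flags), "writable" if is_writable(flags) else "read-only")
```

A store into a segment without the W bit, like the mov to [esi + 7], would be expected to raise the fault. GNU ld documents an -N (--omagic) option that makes the text section writable, which may be worth trying for test builds.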


#9
Renato_Motta

You're right, that's true...
So should I just put
db '/bin/shJAAAAKKKK' in the .data segment?
I wonder why the book tells me to put it in the code segment.

Thanks so much for all your help, btw =)

Oh, another thing: do you know any good books about this?
Or maybe a website you can refer me to, preferably for x64.
It doesn't have to be about shellcode, just something that can explain things like this to me.
Again, thanks for helping me.

#10
RhetoricalRuvim

Maybe it's because the book assumes that you have write access to the code segment. I think it would work in real-address mode, but most programs these days are designed for protected mode. I'm pretty sure it's possible in protected mode too, but operating systems like Windows usually set everything up so that you can't tamper with the code segment.

About references: the Intel manuals are a fairly good reference. There are older Intel processor manuals, such as the Intel Architecture Software Developer's Manual, Volume 1, and there are also newer ones. There's also a book called The Peter Norton Programmer's Guide to the IBM PC (also known as the pink shirt book), copyright 1985 by Peter Norton; if you want to learn more about the PC, you could get that book. The Intel manuals explain a lot, though (about protected mode and other things like this).

Intel Architecture Software Developer's Manual, Volume 1: Basic Architecture
Intel Architecture Software Developer's Manual, Volume 2: Instruction Set Reference Manual
Intel Architecture Software Developer's Manual Volume 3: System Programming
Intel® 64 and IA-32 Architectures Software Developer's Manuals



