
Intro To Win32 Assembly, Using NASM, Part 2

assembly

2 replies to this topic

#1 RhetoricalRuvim


Posted 13 August 2011 - 03:04 PM

Common Instructions
common_instructions_cc.jpg

The Registers

General-Purpose Registers:
  EAX - Accumulator Register
  EBX - Base/Address Register
  ECX - Count Register
  EDX - Data Register
  ESI - Source Index
  EDI - Destination Index
  ESP - Stack Pointer
  EBP - Base Pointer
Status/Control Registers:
  Segment Registers (limited access):
    CS - Code Segment
    DS - Data Segment
    SS - Stack Segment
    ES - Extra Segment
    FS - (Doesn't Really Stand For Anything) Segment
    GS - (Doesn't Really Stand For Anything) Segment
  Control Registers (limited access):
    CR0
    CR2
    CR3
    CR4
  Debug Registers (limited access):
    DR0
    DR1
    DR2
    DR3
  Other Registers (no direct access):
    EIP - Instruction Pointer
    EFLAGS - Flags Register (more on this later)


Instruction Pointer Register - What Is This EIP?
There are several registers in the Intel processor; some of them are general-purpose registers, and some are status and/or control registers. EIP is one of the latter. On the Intel 8086 (16-bit) it was called IP; starting with the Intel 386 (32-bit), it is called EIP. EIP is the instruction pointer register; it points to the next instruction to execute.
One thing to know is that you can't read or write this register directly (for example, with MOV); it changes only as a side effect of control-transfer instructions such as JMP, CALL, and RET.
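
As a small sketch of what "indirectly" means, the fragment below changes EIP with CALL and JMP, and recovers its value through the return address that CALL pushes. The label names are made up for illustration, and the fragment assumes 32-bit code.

```nasm
bits 32
section .text
        call    get_eip         ; pushes the address of the next instruction, then jumps to get_eip
get_eip:
        pop     eax             ; EAX = address of the label get_eip (the pushed return address)
        jmp     skip            ; EIP now points at skip; there is no "mov eip, skip"
        mov     eax, 0          ; never executed, because the jump skips it
skip:
        nop                     ; execution continues here
```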

Flags Register - What Are Flags?
Flags are just bits that indicate true (if set) or false (if clear). The EFLAGS (or FLAGS, on the 8086) register contains flags for different purposes. Many instructions modify flags as a side effect, and conditional jumps read them, which is how tests and comparisons (i.e. "if" statements) are implemented.
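
For example, a simple "if equal" test might look like the following sketch. CMP sets the flags, and JE reads the zero flag (ZF). The label names are hypothetical.

```nasm
bits 32
section .text
        mov     eax, 5
        cmp     eax, 5          ; computes eax - 5 and sets ZF, because the result is zero
        je      equal           ; JE jumps when ZF is set
        mov     ebx, 0          ; "else" branch (skipped here)
        jmp     done
equal:
        mov     ebx, 1          ; "if" branch: EBX = 1
done:
        nop
```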

Instruction Set Reference
If you need reference for any instruction, you can perform a Google-search for "<instruction_name> intel instruction" (without the quotes), and go to the page that looks most relevant.

You can also look at this page, for reference to some common instructions.

For a complete reference of the Intel instruction set, refer to the Intel Architecture Software Developer's Manual volume 2, Instruction Set Reference (Document Download Page).

Assembly Language - Instruction Usage Format
The format of instruction usage for assembly language is as follows:
<label>: <mnemonic> <operand1>, <operand2>, <operand3> ; <comment>

An Intel instruction can have 0 to 3 operands. As you can see, a line has 3 parts: label, instruction (the mnemonic with its operands), and comment. You can have only the label, or only the comment, or only the instruction, or a combination of the three, so long as they are in order (i.e. the label comes before the instruction) and there's only one of each (no more than one label, no more than one instruction, etc.).
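
A few lines that follow this format, as a sketch (the label names are arbitrary, and 32-bit code is assumed):

```nasm
bits 32
section .text
start:                          ; label and comment only
        nop                     ; instruction with zero operands
        inc     eax             ; one operand
        mov     eax, ebx        ; two operands
        imul    eax, ebx, 10    ; three operands: EAX = EBX * 10
        ; comment only
next:   dec     ecx             ; label, instruction, and comment on one line
```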

EAX and AX, EBX and BX, Etc. - Register Parts
EAX is a double-word (32-bit) register. AX is the low-order word of EAX. When we look at a register, the low-order part of it is on the right, while the high-order part of it is on the left (this information helps with using the SHR and SHL instructions).
AL is the low-order byte of AX, and AH is the high-order byte of AX.
The high-order word of EAX has no name of its own, so it can't be accessed directly; you have to shift or rotate it into AX first.
The same goes for EBX, ECX, and EDX. The low-order word of EBX is BX, and so on.

The byte-sized parts (AL, AH, and so on) exist only for the four registers EAX, EBX, ECX, and EDX.
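
Here is a short sketch of reading the different parts of EAX, including one way to get at the high-order word by shifting it down with SHR. The constant is arbitrary.

```nasm
bits 32
section .text
        mov     eax, 0x12345678 ; EAX = 0x12345678
        mov     bx, ax          ; BX = 0x5678 (low-order word of EAX)
        mov     cl, al          ; CL = 0x78   (low-order byte of AX)
        mov     ch, ah          ; CH = 0x56   (high-order byte of AX)
        shr     eax, 16         ; EAX = 0x00001234; the old high-order word is now in AX
        mov     dx, ax          ; DX = 0x1234
```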

What about the other four general-purpose registers?
- SI is the low-order word of ESI.
- DI is the low-order word of EDI.
- SP is the low-order word of ESP.
- BP is the low-order word of EBP.

On the Intel 8086 (16-bit), you only have the word-sized registers and their byte parts (i.e. AX, AL, AH); the double-word registers don't exist (i.e. no EAX, no ESP, etc.).

Addressing Under 8086 - Effective Addresses
Under Intel 8086, you can only use the BX, BP, SI, and DI registers for effective addressing.

The parts of an effective address (for 8086) are:
base + index + offset

Where base can be either BX or BP, index can be either SI or DI, and offset is an immediate (constant) value. Each part is optional, so [si] and [di+2] are valid too.

The following is not allowed:
mov ax, [cx]
mov ax, [bx+cx]

The following is allowed:
mov ax, [bx]
mov ax, [bx+si]
mov ax, [bx+di+8]
mov ax, [bp+si-4]

Addressing Under 386 - Effective Addresses
Under Intel 386, you can use any general-purpose register for memory references.

The format for an effective address is as follows:
base + (index * scale) + displacement

Where:
- base is any of the 8 general-purpose registers.
- index can be any of the 8 general-purpose registers except ESP.
- scale can be 1, 2, 4, or 8.
- displacement is an immediate value.
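
The parts above can be combined as in the following sketch. The array "table" and its contents are made up for the example; the addressing forms themselves are standard 386 ones.

```nasm
bits 32
section .data
table:  dd      10, 20, 30, 40          ; hypothetical array of four double words

section .text
        mov     ebx, table              ; base: address of the array
        mov     esi, 2                  ; index: element number 2 (the third element)
        mov     eax, [ebx]              ; base only: EAX = 10
        mov     eax, [ebx + esi*4]      ; base + index*scale: EAX = 30
        mov     eax, [ebx + esi*4 + 4]  ; base + index*scale + displacement: EAX = 40
        mov     ecx, ebx
        mov     eax, [ecx + 4]          ; ECX works as a base here, unlike CX on the 8086: EAX = 20
        lea     edx, [esi + esi*4]      ; LEA only computes the address: EDX = 2 + 2*4 = 10
```

The scale of 4 matches the size of a double word, which is why index*4 steps through the array one element at a time.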

For more information about effective addressing, refer to the Intel Architecture Software Developer's Manual volume 1, Basic Architecture (Document Download Page).

Register Structure - Where Goes What?
The following is the structure of the EAX register, but same applies for EBX, ECX, and EDX:
eax_structure_cc.jpg

ESI, EDI, ESP, and EBP are similar, but in 32-bit code they don't have byte-sized parts like the first four do (i.e. AL, AH, etc.).

Memory Storage Structure - Little-Endian Byte Order
The bits and bytes are ordered correctly when they're in the registers (such as EAX). But what about when they're stored in memory?

Intel uses little-endian byte ordering, which means that the least-significant byte is stored first, at the lowest address (as opposed to big-endian byte ordering, where the most-significant byte is stored first).

When you save EAX, for example, to a memory location, let's say address 32, AL is saved to 32, AH to 33, and the high-order word of EAX to 34 and 35. When you save AX to 32, AL is still saved to 32, and AH is still saved to 33. That is, in a way, a nice thing: if you want just the low-order word of the integer, you use AX instead of EAX, and the effective address stays the same.
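
The same idea as a sketch in code, with a made-up variable called "value" standing in for memory location 32:

```nasm
bits 32
section .data
value:  dd      0x12345678      ; stored in memory as the bytes 78 56 34 12

section .text
        mov     al, [value]     ; AL = 0x78 (the lowest address holds the least-significant byte)
        mov     ah, [value + 1] ; AH = 0x56
        mov     bx, [value + 2] ; BX = 0x1234 (the high-order word)
        mov     cx, [value]     ; CX = 0x5678, read from the same address as the full double word
        mov     edx, [value]    ; EDX = 0x12345678
```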

Intel Architecture - The Stack
The ESP register contains the memory address of the top of the stack, that is, of the most recently pushed item.

The last thing pushed to the stack is the first thing to be popped off the stack.

One thing to note, though, is that the stack grows down, instead of growing up. So if you push two bytes to the stack, the stack pointer will decrease by two. And then if you pop four bytes off the stack, the stack pointer will increase by four.
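
A small sketch of this, using double-word (4-byte) pushes, which are the usual size in 32-bit code. EBX is used here only to remember where the stack pointer started.

```nasm
bits 32
section .text
        mov     ebx, esp        ; remember the starting stack pointer
        push    dword 1         ; ESP = EBX - 4
        push    dword 2         ; ESP = EBX - 8
        pop     eax             ; EAX = 2 (last pushed, first popped); ESP = EBX - 4
        pop     ecx             ; ECX = 1; ESP = EBX again
```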

Programming Under Windows - Subsystems
There are two major subsystems for Windows programs.
If the program's subsystem is "Console", a console window appears when the program starts, or the program uses the current console window if it was started from the command prompt.
Otherwise, if the program's subsystem is "Windows", no console window appears. The programs we'll make use the Windows subsystem, so we won't start out with a console window.
But we can still ask Windows for a console, if we want one, by using the Win32 API AllocConsole() function; we should then tell Windows when we're done using the console, with the FreeConsole() function.
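
A rough sketch of that sequence is below. It makes several assumptions that may not match your setup: the object file is built with "nasm -f win32", the linker expects decorated stdcall names (with some linkers the names are plain AllocConsole, FreeConsole, and ExitProcess instead), kernel32 is linked in, and the entry point is called _start.

```nasm
bits 32
extern  _AllocConsole@0         ; decorated names are an assumption about the linker
extern  _FreeConsole@0
extern  _ExitProcess@4
global  _start                  ; assumed entry point name

section .text
_start:
        call    _AllocConsole@0 ; ask Windows for a console; EAX is nonzero on success
        ; ... use the console here ...
        call    _FreeConsole@0  ; tell Windows we're done with the console
        push    dword 0         ; exit code
        call    _ExitProcess@4  ; end the process
```

Since nothing happens between the two calls, the console would close again right away; the point is only the order of the calls.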

An example of a console-subsystem program:
console_program_cc.PNG








First Tutorial:
Part 1

Previous Tutorial:
Part 1

Next Tutorial:
Part 3

Edited by RhetoricalRuvim, 20 August 2011 - 07:24 PM.


#2 John


Posted 16 August 2011 - 01:53 PM

Are there 64-bit registers? If so, how are they accessed?

#3 RhetoricalRuvim


Posted 17 August 2011 - 01:30 PM

The idea, for now, is 32-bit Windows programs (hence, it's Win32 Assembly, with NASM).

But yes, there are 64-bit registers. The general-purpose register names start with 'r', as opposed to starting with 'e' (i.e. the accumulator register is RAX, instead of EAX), and their size is 64 bits. 32-bit registers (i.e. EAX), 16-bit registers (i.e. AX), and 8-bit registers (i.e. AL, AH, etc.) can still be accessed from a 64-bit environment.

There are also R8 through R15 registers, besides the 8 original registers, in the 64-bit Intel architecture.
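
As a quick sketch of how these are accessed (this fragment is 64-bit code, so it would not go into the 32-bit programs from this tutorial; the constant is arbitrary):

```nasm
bits 64
section .text
        mov     rax, 0x1122334455667788 ; full 64-bit register
        mov     ebx, eax                ; EBX = 0x55667788; writing EBX also clears the upper half of RBX
        mov     cx, ax                  ; CX = 0x7788
        mov     dl, al                  ; DL = 0x88
        mov     r8, rax                 ; R8 is one of the eight new registers
        mov     r9d, eax                ; R9D is the low-order double word of R9
```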

64-bit processors also generally support a larger address space, so programs can use more of the computer's memory, if more is available.




