
RhetoricalRuvim

Member Since 16 Sep 2010
Offline Last Active Oct 25 2017 03:39 PM
-----

#615143 Partitioning Your Hard Drive for Linux

Posted by RhetoricalRuvim on 24 November 2011 - 04:44 PM

From what I understood from the pink shirt book, a partition is an area on a disk, reserved for an operating system. It would make sense if the definition I'm talking about is somewhat old (the book was published in 1985 by Microsoft Press), as the book does talk a LOT about DOS, FAT12 and FAT16, BIOS, etc.

You don't have to answer this if you don't know, but I'm wondering if there's a way to use LBA addressing to access disk drives with BIOS service interrupts.


#614806 How does a kernel identify which process is calling?

Posted by RhetoricalRuvim on 21 November 2011 - 02:54 PM

At a guess, it would either track which process/thread is currently being executed by the current processor, or it could probably use the interrupt return far address pointer and check to which process's address space the address points.

On the operating system I'm planning to write (some day; ...), the latter is not really an option, because I want the currently-executing process to be loaded into a specific application address space, and unloaded before loading the next process in the to-be-executed thread list; I think that is a better way to manage memory. That way is not the fastest method, but it's not the slowest either.


#614735 Windows API

Posted by RhetoricalRuvim on 20 November 2011 - 03:24 PM

Here's a segment from a program that I have written before:
invoke SetWindowLong, eax, GWL_STYLE, WS_MAXIMIZE 
	
	invoke ShowWindow, [hwnd], SHOW_FULLSCREEN 
	invoke UpdateWindow, [hwnd]

I no longer use the invoke syntax, as I have gone to doing things manually, but it's the same idea.

So the hWnd is obviously the handle to the window you just made, the nIndex is GWL_STYLE, and dwNewLong is WS_MAXIMIZE.

Also, instead of passing SW_SHOWDEFAULT, etc., to ShowWindow (), you pass SHOW_FULLSCREEN to it.


#614101 Drawing A 3D Object?

Posted by RhetoricalRuvim on 14 November 2011 - 08:10 PM

Hello.

I have a program that is supposed to render 3D objects, using 2D drawing tools. The 3D objects are stored as meshes (points & faces type). The program converts all the 3D points into 2D points, and then draws the faces as polygons, in order from most to least distant faces.


Here's an example 3D box:
Posted Image
(link)





Here's the definition of the above box:
<object> 
			<points> 
				vector= figure1 
				1 (0, 0, 0) 
				2 (0, 0, 100) 
				3 (800, 0, 100) 
				4 (800, 0, 0) 
				5 (0, 500, 0) 
				6 (0, 500, 100) 
				7 (800, 500, 100) 
				8 (800, 500, 0) 
			</points> 
			<faces> 
				1, 2, 3, 4 = #FF0000 
				and 
				5, 6, 7, 8 = #0000FF 
				and 
				2, 6, 7, 3 = #00FF00 
				and 
				1, 2, 6, 5 = #FF00FF 
				and 
				3, 4, 8, 7 = #FFFF00 
				and 
				1, 4, 8, 5 = #00FFFF 
			</faces> 
		</object>





The program averages the z value of the points of each face, and orders the faces based on that. The problem, however, is that sometimes the average is not quite the best thing to do; for example, take a look at this:
Posted Image
(link)




It's the same box, just from another angle and position. The average of the points of the face on the right is closer to the camera than that of the face on top.
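The averaging approach described above can be sketched roughly like this. It's in Python rather than the program's own language, and it assumes the points are already in camera space with a larger z meaning farther from the camera; the function names are made up for illustration.

```python
# Points and faces copied from the box definition above (1-based point numbers).
points = {
    1: (0, 0, 0),   2: (0, 0, 100),   3: (800, 0, 100),   4: (800, 0, 0),
    5: (0, 500, 0), 6: (0, 500, 100), 7: (800, 500, 100), 8: (800, 500, 0),
}
faces = [
    ((1, 2, 3, 4), "#FF0000"),
    ((5, 6, 7, 8), "#0000FF"),
    ((2, 6, 7, 3), "#00FF00"),
    ((1, 2, 6, 5), "#FF00FF"),
    ((3, 4, 8, 7), "#FFFF00"),
    ((1, 4, 8, 5), "#00FFFF"),
]

def average_depth(face_points):
    """Average z of a face's corner points (assumption: larger z = farther away)."""
    return sum(points[i][2] for i in face_points) / len(face_points)

def painter_order(face_list):
    """Return faces sorted from most distant to least distant average z."""
    return sorted(face_list, key=lambda face: average_depth(face[0]), reverse=True)

for face_points, color in painter_order(faces):
    print(face_points, color, average_depth(face_points))
```

Four of the six faces of this box share the same average z, which shows how little information the average carries once faces are large or overlap in depth.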


So I wanted to ask, do you guys have any ideas for how to figure out which face comes before which?


Thanks.
RR


#613076 Newb assembly questions

Posted by RhetoricalRuvim on 04 November 2011 - 04:30 PM

As for assemblers, I like NASM.



ORG tells the assembler where in memory the program will be loaded, relative to the current base segment address. On the 8086 and 8088, the base address = segment register value * 16.
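To make the "segment * 16" rule concrete, here is a small sketch in Python (not assembly, just for the arithmetic). The function name is made up, and the wrap to 20 bits assumes a plain 8086/8088 with its 1 MB address space.

```python
def physical_address(segment, offset):
    """Real-mode address: segment * 16 + offset, wrapped to 20 bits (8086/8088)."""
    return ((segment << 4) + offset) & 0xFFFFF

# Two different segment:offset pairs can name the same physical byte.
print(hex(physical_address(0x07C0, 0x0000)))
print(hex(physical_address(0x0000, 0x7C00)))
```

Both calls give the same physical address, which is why the segment value and the ORG value have to agree with each other.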



As for which registers to use where, the best register to use for pointers would be BX (8086 or 8088 - also a good habit, I think, on newer processors, but it's your choice for those).

This is because BX and BP are the registers that are most flexible about effective addressing. Related to this, the best index registers are SI and DI, because of the fact that they're the only ones that can be used as indexes in effective addresses (8086 or 8088).

The best data transfer/processing/arithmetic register(/s) would be the accumulator (AX/AL/AH).

So 'mov ax, [bx+si]', 'mov ax, [bx+di]', 'mov ax, [bp+si]', etc., are all valid. You can also add immediate offsets to those effective addresses, as in 'mov ax, [bx+si+8]'.



About interrupts, the processor executes the currently-running program, and then something happens outside the processor that "interrupts" the processor from the program, and makes the processor go and see what's going on outside. Each piece of outside hardware has an interrupt number, and when that piece of hardware changes state or something, the processor goes to the interrupt reference table (in the 8086/8088 case the Interrupt Vector Table, or the IVT), and checks the entry for that interrupt number. The entry consists of an offset (for IVT, 16-bit address) and a segment (for IVT, 16-bit segment value; this * 16 + offset is where the processor would go to, to execute the interrupt-handling code), respectively.
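Here's a rough sketch of that IVT lookup, in Python for brevity. It assumes the real-mode layout where each entry is 4 bytes (offset word first, then segment word) and the table starts at address 0; the function names and the sample handler address are made up.

```python
import struct

def ivt_entry_address(interrupt_number):
    """Each real-mode IVT entry is 4 bytes, so entry N starts at N * 4."""
    return interrupt_number * 4

def read_ivt_entry(memory, interrupt_number):
    """Return (segment, offset, physical handler address) for one interrupt."""
    base = ivt_entry_address(interrupt_number)
    offset, segment = struct.unpack_from("<HH", memory, base)
    return segment, offset, (segment << 4) + offset

# A fake 1 KB table (256 entries) with one handler filled in at F000:1234.
memory = bytearray(1024)
struct.pack_into("<HH", memory, ivt_entry_address(0x21), 0x1234, 0xF000)

segment, offset, handler = read_ivt_entry(memory, 0x21)
print(hex(segment), hex(offset), hex(handler))
```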

Newer processors have this Interrupt Descriptor Table (IDT), which works differently, is more complex, and deals a lot with pointers. Despite all the complexity of the 386 and newer processors, however, the 32-bit registers, and register-use flexibility, would probably make up for all this complicated stuff.


#612588 How to fix code to read input from textbox in VB.NET?

Posted by RhetoricalRuvim on 29 October 2011 - 04:20 PM

Try 'MsgBox (number)' or whatever the syntax is for displaying what 'number' currently is, right before the 'if' statement.


#607881 C# - Enums

Posted by RhetoricalRuvim on 20 August 2011 - 06:45 PM

Think of it like this; There are numbers in the enum that have names assigned to them, if we get the input "2" our "dir" value is 2. Now the name correlating with the value 2 is? West!


But you said 2 is South.

Though it is more understandable now, I would say. So the user types a number, and not a word; that makes more sense.


#607798 Assembly, Using Memory Allocation and Alphabet Algorithm (Win32, NASM)

Posted by RhetoricalRuvim on 19 August 2011 - 06:05 PM

In this tutorial we'll look at two more Win32 API functions and, after that, write an algorithm function that uses one of those Win32 API functions. In other words, we'll learn about more Win32 API functions, after which we'll practice using what we learned.

Overview
  • Two New Win32 API Functions
  • A New Algorithm
  • Example Program




Two New Win32 API Functions
When a program accesses memory from the code, data, or bss section, it doesn't use the actual physical memory address; it uses a virtual address. Inside the PE file, locations are recorded as Relative Virtual Addresses (RVAs), which are relative to where the program gets loaded. The processor then uses segmentation and page tables to translate the virtual address into a physical memory address.

When a program is loaded into memory, it's loaded to a specific virtual address.



What Is A Virtual Address?
A virtual address is used by programs just like a regular memory address. Then there are these things called page tables, into which the processor looks to see where the actual memory address is.

Let's say, for example, that the program is trying to access memory address 0x823. The page table for our program, that the operating system made, looks something like this:
Page Size: 64
Entries:
at physical address 0x500
not present
not present
at physical address 0x18000
not present
at physical address 0x12000
not present
not present
not present
.....
(this is entry number 32, now)
at physical address 0x24000
.....

I have never worked with page tables directly, only read a little about them, so this is only roughly what an actual table looks like (real x86 pages are normally 4 KB; 64 just keeps the numbers small). For the real Intel page table format, read the system programming volume of the Intel manual.

The processor would then find entry number 32, because that's where 0x800 starts from (32 * 64 = 0x800). Then it would add the remaining 0x23 to 0x24000 to get the physical memory address to access: 0x24023.
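The lookup just described can be sketched like this. It's a toy model in Python using the made-up table above, not how Intel page tables are actually laid out; the exception name is hypothetical and stands in for the processor notifying the operating system.

```python
PAGE_SIZE = 64

# Entry number -> physical address of that page; missing entries are "not present".
page_table = {0: 0x500, 3: 0x18000, 5: 0x12000, 32: 0x24000}

class PageNotPresent(Exception):
    """Stand-in for the fault that hands control to the operating system."""

def translate(virtual_address):
    """Split the address into (entry, offset) and look the entry up."""
    entry, offset = divmod(virtual_address, PAGE_SIZE)
    if entry not in page_table:
        raise PageNotPresent(f"entry {entry} is not present")
    return page_table[entry] + offset

print(hex(translate(0x823)))
```

Accessing an address whose entry is missing, such as 0x40 (entry 1), raises PageNotPresent instead of returning an address.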

If the program tries accessing another memory location, for which the page table is marked as not present, the processor would notify the operating system about the issue and the operating system would load the correct page into memory - usually from the hard drive or another storage device - and then direct the processor back to the program, so that it can continue running.
When the operating system loads the desired page into memory, it also usually stores an unused page to the storage device. That way, even if the computer only has 2 GB of RAM, the program would still be able to access 4 GB of virtual memory.

So yeah, a virtual address is basically an address that's used for addressing, while the processor and the operating system take care of the actual physical memory and storage management.


The RVA is the virtual address, relative to where the program has been loaded; it could also be relative to where the program has been loaded + some number, depending on how the linker sets things up in the PE file's headers.
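In the simplest case that relationship is just an addition. A tiny Python sketch, where the load address and the RVA are example values I picked (not taken from any particular program):

```python
def rva_to_va(load_address, rva):
    """Virtual address = where the program was loaded + relative virtual address."""
    return load_address + rva

# Assumed example values: a program loaded at 0x00400000, data at RVA 0x2010.
print(hex(rva_to_va(0x00400000, 0x2010)))
```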

But when you use the bss section, you have to determine how much memory you need when you write the program. What if you end up needing more memory than you thought you would need? Or what if you end up needing less memory, so you'll be taking up more resources than you need?

In cases like this, we can ask Windows for more memory. But, you need to remember, when you're done using the memory - just like when you're done using a file - you have to tell Windows about that.

You wouldn't be very happy if someone borrowed something from you and never brought it back, would you?

So it's something like "Windows, can I borrow 4 more KB of memory? ... Okay, Windows, I'm done using that, you can take it back now."

GlobalAlloc - Asking Windows For More (Global) Memory
GlobalAlloc() has the following parameters:
  • Flags
  • Number of bytes to allocate (borrow)

The return value is a pointer to the newly-allocated memory when no flags are passed (flags = 0, which is GMEM_FIXED), or NULL if the allocation fails.

For more reference to the GlobalAlloc() Win32 API function, visit GlobalAlloc Function (Windows).

GlobalFree - Telling Windows That You're Done Using The Allocated (Global) Memory
GlobalFree() has only one parameter:
  • The pointer to the memory location to free.

For more reference to the GlobalFree() Win32 API function, visit GlobalFree Function (Windows).





A New Algorithm
We'll make an alphabet algorithm, now.

Before finding out what the algorithm does, it might help to know its parameters:
  • Starting letter.
  • Letter to stop after.
  • Whether or not to put spaces between every letter.

This algorithm returns a (pointer to a) string with letters of the alphabet in it, starting from the letter parameter1, stopping after the letter parameter2; if parameter3 is non-zero, also put spaces between every letter in the string.

Keeping track of the current letter is a big part of the algorithm.

The returned string is contained in a buffer allocated with the GlobalAlloc() Win32 API function, so you should free the pointer to the string, when you no longer need it, using the GlobalFree() Win32 API function.

I'll explain more details using comments, in the code. Here's the code for the abc() function:
;; abc() - returns a string with the alphabet. 
;; parameters: 
;;  	the letter to start from 
;;  	the letter to stop after 
;;  	whether to separate each letter from another letter with a space 
;; return value: 
;;  	the pointer to the new string with the letters 
abc: 
	enter 4, 0 
	push dword 0                      ;; Reserve a second local, [ebp-8], for the character count. 
	push ebx                          ;; Save EBX. 
	
	;; First of all, we need to run a scan of the letters, without saving the values. 
	xor ebx, ebx                      ;; This should tell the loop not to save the values. 
	call .the_loop                    ;; Call the letter scan loop. 
	
	;; The total size of the new string should be equal to the value returned 
	;; by .the_loop in ECX. 
	;; We need to save that number. 
	mov eax, ecx 
	mov dword [ebp-8], eax 
	
	;; Now we need to ask Windows to allocate some memory for us. 
	;mov eax, dword [ebp-8]           ;; EAX is already [ebp-8]. 
	inc eax                           ;; We increment that number, because we'll need an extra byte for the NULL terminator. 
	push eax 
	push dword 0                      ;; No flags, for now. 
	call [GlobalAlloc]                ;; Call the Windows global memory allocation API function. 
	mov dword [ebp-4], eax            ;; Save the pointer that GlobalAlloc() returned. 
	
	mov ebx, eax                      ;; Also use that pointer for the .the_loop function. 
	call .the_loop                    ;; Now we run the loop and save everything to the new string. 
	
	mov eax, dword [ebp-4]            ;; We would return the pointer to the new string. 
	
	jmp .finish                       ;; We'll need space to define our nested function, 
	;; so we'll have to jump over the nested function 
	;; to the .finish label. 
	
	.the_loop: 
		;; EBX is the pointer to the buffer for the new string. 
		
		push edi                      ;; Save EDI. 
		
		xor ecx, ecx                  ;; We're supposed to be counting how many characters the new string would have. 
		;; We start counting from 0, for now. 
		
		mov eax, dword [ebp+12]       ;; Get the letter to stop at. 
		cmp eax, 122 
		jng .the_loop_over1 
			;; If the letter to stop after is greater than 'z', set it to 'z'. 
			mov eax, 122 
		.the_loop_over1: 
		cmp eax, 65 
		jnl .the_loop_over2 
			;; If the letter to stop after is less than 'A', set it to 'A'. 
			mov eax, 65 
		.the_loop_over2: 
		mov edi, eax                  ;; Save that letter in EDI. 
		
		mov eax, dword [ebp+08]       ;; Get the letter to start from. 
		mov edx, eax 
		
		.the_loop1: 
			;; Check if it's time to stop the loop yet. 
			cmp edx, edi              ;; Compare the current letter to the letter to stop after. 
			jg .the_loop1s            ;; If the current letter is greater, break the loop. 
			
			cmp edx, 65 
			jnl .the_loop1over1 
				;; If the current letter is less than 'A', set it to 'A'. 
				mov edx, 65 
			.the_loop1over1: 
			
			cmp edx, 122 
			jng .the_loop1over2 
				;; If the current letter is greater than 'z', reset it to 'A'. 
				mov edx, 65 
			.the_loop1over2: 
			
			cmp edx, 97 
			jnl .the_loop1over2b 
				cmp edx, 90 
				jng .the_loop1over2b 
					;; If the current letter is less than 'a' and greater than 'Z' (not a letter character), 
					;; set the current letter to 'a'. 
					mov edx, 97 
			.the_loop1over2b: 
			
			cmp ebx, 0 
			jz .the_loop1over3 
				;; If EBX is not a NULL pointer (meaning if we're supposed to save the output), 
				;; do the following: 
				
				;; Get the current letter. 
				mov eax, edx 
				
				;; Save the current character. 
				mov byte [ebx], al 
				
				;; Increment the pointer. 
				inc ebx 
				
				cmp dword [ebp+16], 0 ;; Check if the third parameter is FALSE. 
				jz .the_loop1over3    ;; If so, skip over the space-adding part of the code. 
				
				;; Otherwise, save a space to where the pointer is pointing. 
				mov byte [ebx], 32 
				
				;; And increment the pointer. 
				inc ebx 
			.the_loop1over3: 
			
			cmp dword [ebp+16], 0     ;; Check if the third parameter is FALSE. 
			jz .the_loop1over4 
			
			;; If it's TRUE, increment the count an extra time. 
			inc ecx 
			
			.the_loop1over4: 
			
			;; In any case, we'll still need to increment the count. 
			inc ecx 
			
			;; Increment the current letter. 
			inc edx 
			
			;; Continue the loop. 
			jmp .the_loop1 
		.the_loop1s: 
		
		;; If we're putting spaces between the letters: 
		cmp dword [ebp+16], 0 
		jz .the_loop_over3 
			;; Since the last letter doesn't need a following space, we'll decrement the character count. 
			dec ecx 
			
			;; If the pointer is NULL, then this next part of the code is to be skipped. 
			cmp ebx, 0 
			jz .the_loop_over4 
			
			;; But we still put a space for the last character, so we'll have to put a NULL to that position. 
			mov byte [ebx-1], 0 
			
			;; .the_loop_over3 is for not-putting-spaces code. 
			jmp .the_loop_over4 
		.the_loop_over3: 
			;; If the pointer is NULL, then this next part of the code is to be skipped. 
			cmp ebx, 0 
			jz .the_loop_over4 
			
			;; Save a NULL character at [pointer]. 
			mov byte [ebx], 0 
		.the_loop_over4: 
		
		pop edi                       ;; Restore EDI. 
	ret 0 
	
	.finish: 
	
	pop ebx                           ;; Restore EBX. 
	leave 
ret 12
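As a cross-check of what abc() is meant to produce, here's a short Python sketch of the same idea. It's an approximation of the intended behaviour rather than a line-by-line translation: it simply skips the non-letter codes between 'Z' and 'a', and buffer_size() is a made-up helper showing why the assembly allocates one extra byte.

```python
def abc(start, stop, spaces):
    """Letters from code `start` through code `stop`, optionally space-separated."""
    stop = max(65, min(stop, 122))      # clamp the stop letter to 'A'..'z'
    start = max(start, 65)              # never start below 'A'
    letters = [chr(code) for code in range(start, stop + 1)
               if not 90 < code < 97]   # skip the codes between 'Z' and 'a'
    return (" " if spaces else "").join(letters)

def buffer_size(text):
    """Bytes the assembly version would allocate: the characters plus a NULL."""
    return len(text) + 1

print(abc(70, 80, False))
print(abc(97, 122, True))
print(buffer_size(abc(97, 122, True)))
```

With spaces, n letters need n - 1 separators, which matches the extra count increments and the final decrement in .the_loop.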






Example Program


Example Program - The Idea
  • Display a message box with the alphabet 'A' through 'z' (meaning, first 'A' through 'Z' and then 'a' through 'z'); no spaces between letters.
  • Display a message box with the alphabet 'F' through 'P'; no spaces between letters.
  • Display a message box with the alphabet 'a' through 'z'; put spaces between letters.

Example Program - The Code
;; Define the externs. 
extern MessageBoxA 
extern ExitProcess 
extern GlobalAlloc 
extern GlobalFree 

;; Construct our symbol import table. 
import MessageBoxA user32.dll 
import ExitProcess kernel32.dll 
import GlobalAlloc kernel32.dll 
import GlobalFree kernel32.dll 

;; This is the code section; use 32-bit code. 
section .text use32 
;; Start execution here. 
..start: 

;; Call the main() function. 
call main 

;; Exit, returning whatever main() returned. 
push eax 
call [ExitProcess] 

main: 
	enter 4, 0 
	
	push dword 0                      ;; Don't put spaces between every letter. 
	push dword 122                    ;; Stop after 'z'. 
	push dword 65                     ;; Start at 'A'. 
	call abc 
	mov dword [ebp-4], eax            ;; Save the pointer to the string. 
	
	;; Display a message box with the new string. 
	push dword 0 
	push dword the_title 
	push dword [ebp-4] 
	push dword 0 
	call [MessageBoxA] 
	
	;; Free the buffer for the string. 
	push dword [ebp-4] 
	call [GlobalFree] 
	
	push dword 0                      ;; Don't put spaces between every letter. 
	push dword 80                     ;; Stop after 'P'. 
	push dword 70                     ;; Start at 'F'. 
	call abc 
	mov dword [ebp-4], eax            ;; Save the pointer to the string. 
	
	;; Display a message box with the new string. 
	push dword 0 
	push dword the_title 
	push eax                          ;; Since EAX is already equal to [ebp-4], why not just use EAX? 
	push dword 0 
	call [MessageBoxA] 
	
	;; Free the buffer. 
	push dword [ebp-4] 
	call [GlobalFree] 
	
	push dword 1                      ;; Put spaces between every letter. 
	push dword 122                    ;; Stop at 'z'. 
	push dword 97                     ;; Start at 'a'. 
	call abc 
	mov dword [ebp-4], eax            ;; Save the pointer to the string. 
	
	;; Display a message box with the new string. 
	push dword 0 
	push dword the_title 
	push dword [ebp-4] 
	push dword 0 
	call [MessageBoxA] 
	
	;; Free the string buffer. 
	push dword [ebp-4] 
	call [GlobalFree] 
	
	;; Return 0. 
	xor eax, eax 
	leave 
ret 

;; abc() - returns a string with the alphabet. 
;; parameters: 
;;  	the letter to start from 
;;  	the letter to stop after 
;;  	whether to separate each letter from another letter with a space 
;; return value: 
;;  	the pointer to the new string with the letters 
abc: 
	enter 4, 0 
	push dword 0                      ;; Reserve a second local, [ebp-8], for the character count. 
	push ebx                          ;; Save EBX. 
	
	;; First of all, we need to run a scan of the letters, without saving the values. 
	xor ebx, ebx                      ;; This should tell the loop not to save the values. 
	call .the_loop                    ;; Call the letter scan loop. 
	
	;; The total size of the new string should be equal to the value returned 
	;; by .the_loop in ECX. 
	;; We need to save that number. 
	mov eax, ecx 
	mov dword [ebp-8], eax 
	
	;; Now we need to ask Windows to allocate some memory for us. 
	;mov eax, dword [ebp-8]           ;; EAX is already [ebp-8]. 
	inc eax                           ;; We increment that number, because we'll need an extra byte for the NULL terminator. 
	push eax 
	push dword 0                      ;; No flags, for now. 
	call [GlobalAlloc]                ;; Call the Windows global memory allocation API function. 
	mov dword [ebp-4], eax            ;; Save the pointer that GlobalAlloc() returned. 
	
	mov ebx, eax                      ;; Also use that pointer for the .the_loop function. 
	call .the_loop                    ;; Now we run the loop and save everything to the new string. 
	
	mov eax, dword [ebp-4]            ;; We would return the pointer to the new string. 
	
	jmp .finish                       ;; We'll need space to define our nested function, 
	;; so we'll have to jump over the nested function 
	;; to the .finish label. 
	
	.the_loop: 
		;; EBX is the pointer to the buffer for the new string. 
		
		push edi                      ;; Save EDI. 
		
		xor ecx, ecx                  ;; We're supposed to be counting how many characters the new string would have. 
		;; We start counting from 0, for now. 
		
		mov eax, dword [ebp+12]       ;; Get the letter to stop at. 
		cmp eax, 122 
		jng .the_loop_over1 
			;; If the letter to stop after is greater than 'z', set it to 'z'. 
			mov eax, 122 
		.the_loop_over1: 
		cmp eax, 65 
		jnl .the_loop_over2 
			;; If the letter to stop after is less than 'A', set it to 'A'. 
			mov eax, 65 
		.the_loop_over2: 
		mov edi, eax                  ;; Save that letter in EDI. 
		
		mov eax, dword [ebp+08]       ;; Get the letter to start from. 
		mov edx, eax 
		
		.the_loop1: 
			;; Check if it's time to stop the loop yet. 
			cmp edx, edi              ;; Compare the current letter to the letter to stop after. 
			jg .the_loop1s            ;; If the current letter is greater, break the loop. 
			
			cmp edx, 65 
			jnl .the_loop1over1 
				;; If the current letter is less than 'A', set it to 'A'. 
				mov edx, 65 
			.the_loop1over1: 
			
			cmp edx, 122 
			jng .the_loop1over2 
				;; If the current letter is greater than 'z', reset it to 'A'. 
				mov edx, 65 
			.the_loop1over2: 
			
			cmp edx, 97 
			jnl .the_loop1over2b 
				cmp edx, 90 
				jng .the_loop1over2b 
					;; If the current letter is less than 'a' and greater than 'Z' (not a letter character), 
					;; set the current letter to 'a'. 
					mov edx, 97 
			.the_loop1over2b: 
			
			cmp ebx, 0 
			jz .the_loop1over3 
				;; If EBX is not a NULL pointer (meaning if we're supposed to save the output), 
				;; do the following: 
				
				;; Get the current letter. 
				mov eax, edx 
				
				;; Save the current character. 
				mov byte [ebx], al 
				
				;; Increment the pointer. 
				inc ebx 
				
				cmp dword [ebp+16], 0 ;; Check if the third parameter is FALSE. 
				jz .the_loop1over3    ;; If so, skip over the space-adding part of the code. 
				
				;; Otherwise, save a space to where the pointer is pointing. 
				mov byte [ebx], 32 
				
				;; And increment the pointer. 
				inc ebx 
			.the_loop1over3: 
			
			cmp dword [ebp+16], 0     ;; Check if the third parameter is FALSE. 
			jz .the_loop1over4 
			
			;; If it's TRUE, increment the count an extra time. 
			inc ecx 
			
			.the_loop1over4: 
			
			;; In any case, we'll still need to increment the count. 
			inc ecx 
			
			;; Increment the current letter. 
			inc edx 
			
			;; Continue the loop. 
			jmp .the_loop1 
		.the_loop1s: 
		
		;; If we're putting spaces between the letters: 
		cmp dword [ebp+16], 0 
		jz .the_loop_over3 
			;; Since the last letter doesn't need a following space, we'll decrement the character count. 
			dec ecx 
			
			;; If the pointer is NULL, then this next part of the code is to be skipped. 
			cmp ebx, 0 
			jz .the_loop_over4 
			
			;; But we still put a space for the last character, so we'll have to put a NULL to that position. 
			mov byte [ebx-1], 0 
			
			;; .the_loop_over3 is for not-putting-spaces code. 
			jmp .the_loop_over4 
		.the_loop_over3: 
			;; If the pointer is NULL, then this next part of the code is to be skipped. 
			cmp ebx, 0 
			jz .the_loop_over4 
			
			;; Save a NULL character at [pointer]. 
			mov byte [ebx], 0 
		.the_loop_over4: 
		
		pop edi                       ;; Restore EDI. 
	ret 0 
	
	.finish: 
	
	pop ebx                           ;; Restore EBX. 
	leave 
ret 12 

;; The data section. 
section .data 
the_title                                             db "Memory Alphabet Example", 0 

;; We don't have to define every section in our source code; NASM would do the defining even if we don't.

Example Program - The Output
You should get three message boxes in a row, for the output. The message boxes should say:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
FGHIJKLMNOP
a b c d e f g h i j k l m n o p q r s t u v w x y z

Here's a screenshot of the third message box:
http://forum.codecal...tachmentid=4153











First Tutorial:
Intro To Win32 Assembly, Using NASM

Previous Tutorial:
File I/O and 'incbin'

Next Tutorial:
Handling Bugs In Your Programs






References:
Intel Manual System Programming Guide: Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3 (3A & 3B): System Programming Guide
GlobalAlloc Win32 API Function: GlobalAlloc Function (Windows)
GlobalFree Win32 API Function: GlobalFree Function (Windows)

Attached Thumbnails

  • MemoryAlphabet_output_cc.PNG



#607210 Intro To Win32 Assembly, Using NASM, Part 2

Posted by RhetoricalRuvim on 13 August 2011 - 03:04 PM

Common Instructions
common_instructions_cc.jpg

The Registers

General-Purpose Registers:
- EAX - Accumulator Register
- EBX - Base/Address Register
- ECX - Count Register
- EDX - Data Register
- ESI - Source Index
- EDI - Destination Index
- ESP - Stack Pointer
- EBP - Base Pointer

Status/Control Registers:

Segment Registers (limited access):
- CS - Code Segment
- DS - Data Segment
- SS - Stack Segment
- ES - Extra Segment
- FS - (Doesn't Really Stand For Anything) Segment
- GS - (Doesn't Really Stand For Anything) Segment

Control Registers (limited access):
- CR0
- CR2
- CR3
- CR4

Debug Registers (limited access):
- DR0
- DR1
- DR2
- DR3

Other Registers (no direct access):
- EIP - Instruction Pointer
- EFLAGS - Flags Register (more on this later)


Instruction Pointer Register - What Is This EIP?
There are several registers in the Intel processor; some of them are general-purpose registers, and some are status and/or control registers. EIP is one of the latter. On the Intel 8086 (16-bit), it was called IP, but starting from the Intel 386 (32-bit), it's called EIP. EIP is the instruction pointer register; it points to the next instruction to execute.
But one thing to know is that you can't change this register directly; it only changes indirectly, through instructions such as JMP, CALL, and RET.

Flags Register - What Are Flags?
Flags are just bits that indicate true (if set) or false (if clear). The EFLAGS (or FLAGS, in 8086) register contains flags for different purposes. Also, a lot of the instructions modify flags. Flags are also used for tests and comparisons (i.e. "if" statements).
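To give a feel for how a comparison turns into flags, here's a rough Python model of what CMP does to four of the common flags on 32-bit operands. The function name is made up, and it only models ZF, CF, SF, and OF, not the whole EFLAGS register.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(a, b):
    """Model of 'cmp a, b': subtract b from a, keep only the flags."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    sign = lambda value: value >> 31          # top bit of a 32-bit value
    return {
        "ZF": int(result == 0),               # zero: the operands were equal
        "CF": int(a < b),                     # carry: unsigned borrow happened
        "SF": sign(result),                   # sign: result looks negative
        "OF": int(sign(a) != sign(b) and sign(result) != sign(a)),  # signed overflow
    }

print(cmp_flags(5, 5))
print(cmp_flags(3, 5))
```

A conditional jump then just reads these bits: JE jumps when ZF is set, JB when CF is set, and JL when SF and OF differ.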

Instruction Set Reference
If you need reference for any instruction, you can perform a Google-search for "<instruction_name> intel instruction" (without the quotes), and go to the page that looks most relevant.

You can also look at this page, for reference to some common instructions.

For a complete reference of the Intel instruction set, refer to the Intel Architecture Software Developer's Manual volume 2, Instruction Set Reference (Document Download Page).

Assembly Language - Instruction Usage Format
The format of instruction usage for assembly language is as follows:
<label>: <mnemonic> <operand1>, <operand2>, <operand3> ; <comment>

An Intel instruction can have 0 to 3 operands. As you can see, there are 3 parts: label, instruction, comment. You can have only the label, or only the comment, or only the instruction, or a combination of the three, so long as they are in order (i.e. the label comes before the instruction) and there's only one of each (no more than one label, no more than one instruction, etc.).
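The line format above can be illustrated with a small Python sketch that splits one source line into its parts. This is only a toy for simple lines, not how NASM actually parses; the pattern and the function name are my own.

```python
import re

LINE = re.compile(
    r"^\s*(?:(?P<label>[A-Za-z_.?][\w.?$]*):)?"  # optional "label:"
    r"\s*(?P<mnemonic>[A-Za-z]\w*)?"              # optional mnemonic
    r"\s*(?P<operands>[^;]*?)"                    # optional operand list
    r"\s*(?:;\s*(?P<comment>.*))?$"               # optional "; comment"
)

def parse_line(text):
    """Return (label, mnemonic, [operands], comment); missing parts are None or []."""
    match = LINE.match(text)
    operands = match["operands"]
    operand_list = [op.strip() for op in operands.split(",")] if operands else []
    return match["label"], match["mnemonic"], operand_list, match["comment"]

print(parse_line("start: mov eax, [ebx+4] ; load a dword"))
print(parse_line("ret"))
print(parse_line("; only a comment"))
```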

EAX and AX, EBX and BX, Etc. - Register Parts
EAX is a double-word sized register. AX is the lower-order word of EAX. When we look at a register, the low-order part of it is on the right, while the high-order part of it is on the left (this information helps with using the SHR and SHL instructions).
AL is the low-order byte of AX, and AH is the high-order byte of AX.
It's not as easy to access the high-order word of EAX, though, since it has no register name of its own.
Same goes for EBX, ECX, and EDX. Low-order word of EBX is BX, and so on.

The above only applies for the four registers EAX, EBX, ECX, and EDX.

What about the other four general-purpose registers?
- SI is the low-order word of ESI.
- DI is the low-order word of EDI.
- SP is the low-order word of ESP.
- BP is the low-order word of EBP.

Under Intel 8086 (16-bit), you only have the lower-word parts, and smaller (ie AX, AL, AH); you don't have the double-word registers (ie no EAX, no ESP, etc.).
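The register parts described above are just bit fields of the same 32-bit value. A quick Python sketch using masks and shifts (the function name and the sample value are made up):

```python
def register_parts(eax):
    """Split a 32-bit EAX value into the parts that have (or lack) names."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF                 # low-order word
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,       # high-order byte of AX
        "AL": ax & 0xFF,              # low-order byte of AX
        "high_word": eax >> 16,       # no register name; needs a shift or rotate
    }

for name, value in register_parts(0x12345678).items():
    print(name, hex(value))
```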

Addressing Under 8086 - Effective Addresses
Under Intel 8086, you can only use the BX, BP, SI, and DI registers in effective addresses.

The parts of an effective address (for 8086) are:
base + index + offset

Where base can be either BX or BP, index can be either SI or DI, and offset is an immediate value.

The following is not allowed:
mov ax, [cx]
mov ax, [bx+cx]

The following is allowed:
mov ax, [bx]
mov ax, [bx+si]
mov ax, [bx+di+8]
mov ax, [bp+si-4]
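The rule behind those two lists can be written down as a small check. This Python sketch only looks at which registers appear inside the brackets (at most one base, at most one index, nothing else); the function name is made up.

```python
BASES = {"bx", "bp"}
INDEXES = {"si", "di"}

def is_valid_8086_address(registers):
    """True if the registers form a legal 8086 effective address."""
    regs = [reg.lower() for reg in registers]
    bases = [reg for reg in regs if reg in BASES]
    indexes = [reg for reg in regs if reg in INDEXES]
    others = [reg for reg in regs if reg not in BASES | INDEXES]
    return not others and len(bases) <= 1 and len(indexes) <= 1

print(is_valid_8086_address(["cx"]))          # the [cx] case
print(is_valid_8086_address(["bx", "si"]))    # the [bx+si] case
```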

Addressing Under 386 - Effective Addresses
Under Intel 386, you can use any general-purpose register for memory references.

The format for an effective address is as follows:
base + (index * scale) + displacement

Where:
- base is any of the 8 general-purpose registers.
- index can be any of the 8 general-purpose registers except ESP.
- scale can be 1, 2, 4, or 8.
- displacement is an immediate value.
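Here's the 386 formula worked through in Python, as a sketch. The register values are invented, and the result is wrapped to 32 bits the way a 32-bit address would be.

```python
def effective_address(base=0, index=0, scale=1, displacement=0):
    """base + (index * scale) + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF

# Like 'mov eax, [ebx + esi*4 + 8]' with EBX = 0x1000 and ESI = 3 (assumed values).
print(hex(effective_address(base=0x1000, index=3, scale=4, displacement=8)))
```

The scale factor is what makes array indexing cheap: with scale 4, the index register can hold an element number for an array of dwords directly.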

For more information about effective addressing, refer to the Intel Architecture Software Developer's Manual volume 1, Basic Architecture (Document Download Page).

Register Structure - Where Goes What?
The following is the structure of the EAX register, but same applies for EBX, ECX, and EDX:
eax_structure_cc.jpg

ESI, EDI, ESP, and EBP are similar, but they just don't have easily-accessible byte parts as the first four have (ie AL, AH, etc.).

Memory Storage Structure - Little-Endian Byte Order
The bits and bytes are ordered correctly when they're in the registers (such as EAX). But what about when they're stored in memory?

Intel uses little-endian byte ordering, which means that the least-significant byte comes first in memory (as opposed to big-endian byte ordering, where the most-significant byte comes first).

When you save EAX, for example, to a memory location, let's say 32, AL is saved to 32, AH to 33, and the high-order word of EAX to 34 and 35. When you save AX to 32, AL is still saved to 32, and AH is still saved to 33. That is, in a way, a nice thing: if you want just the lower-order word of the integer, you use AX instead of EAX, and the effective address stays the same.
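You can see the byte order without touching assembly at all. This Python sketch packs a sample value the way a little-endian store would; the value 0x12345678 is just an example.

```python
import struct

eax = 0x12345678
stored = struct.pack("<I", eax)       # "<I" = little-endian 32-bit unsigned

# Pretend the value was saved at address 32: print each byte with its address.
for address, byte in enumerate(stored, start=32):
    print(address, hex(byte))

# Reading only a word from the same starting address gives the low-order word.
low_word = struct.unpack_from("<H", stored, 0)[0]
print(hex(low_word))
```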

Intel Architecture - The Stack
The ESP register contains the memory address of the top of the current stack.

The last thing pushed to the stack is the first thing to be popped off the stack.

One thing to note, though, is that the stack grows down, instead of growing up. So if you push two bytes to the stack, the stack pointer will decrease by two. And then if you pop four bytes off the stack, the stack pointer will increase by four.
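A toy model of that behaviour, in Python. It only pushes and pops dwords (4 bytes at a time, as 32-bit code usually does), and the class name and stack size are made up.

```python
import struct

class Stack:
    """Tiny downward-growing stack: ESP drops on push and rises on pop."""

    def __init__(self, size=64):
        self.memory = bytearray(size)
        self.esp = size                       # empty stack: ESP is just past the end

    def push(self, value):
        self.esp -= 4                         # the stack grows down
        struct.pack_into("<I", self.memory, self.esp, value & 0xFFFFFFFF)

    def pop(self):
        (value,) = struct.unpack_from("<I", self.memory, self.esp)
        self.esp += 4                         # the stack shrinks back up
        return value

stack = Stack()
stack.push(1)
stack.push(2)
print(stack.esp)      # lower than the starting value
print(stack.pop())    # the last value pushed comes off first
```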

Programming Under Windows - Subsystems
There are two major subsystems for Windows programs.
If the program's subsystem is "Console", a console window will appear, or the program would use the current command prompt console window (if started from command prompt), when the program starts.
Otherwise, if the program's subsystem is "Windows", no console window will appear. The type of programs we'll make use the windows subsystem, so we won't start out with a console window.
But we can still ask Windows for a console, if we want one, by using the Win32 API AllocConsole() function; we will, however, have to tell Windows when we're done using the console, with the FreeConsole() function.

An example of a console subsystem:
console_program_cc.PNG








First Tutorial:
Part 1

Previous Tutorial:
Part 1

Next Tutorial:
Part 3


#607207 Intro To Win32 Assembly, Using NASM, Part 1

Posted by RhetoricalRuvim on 13 August 2011 - 02:15 PM

Introduction
There are a lot of ways to program a computer; one of them is by using assembly language, with the free NASM assembler, on the Windows platform.

One of the things I like about assembly language is that you actually get to think and program from the processor's perspective.


Getting Started

Materials - What Will You Need?
  • The NASM assembler (NASM)
  • The ALINK linker (ALINK)
  • A text editor - Notepad++ (Notepad++) is a good one

Note: When you install NASM and ALINK, it is preferred to install them to the C:\ directory, for easy access.

Windows Programs - What Are PE Files?
PE files are Portable Executable files, the program format intended for the Windows platform.
A PE file has headers, code, data, and idata sections; there are probably more, but we'll just go over the basics here.
- The code section of the executable is the area inside the file that contains the code for the program.
- The data section encloses the data of the program.
- The idata section has the symbol import tables.
- There's also the bss section, which is also for data, but it's not part of the actual file on the disk.

It's usually the headers, then the code, then the data, within the PE file. The bss section extends off of the executable, when it is in memory.

For a reference to the PE/COFF file format, you can read the Microsoft PE and COFF Specification (The Page).

The Idata Section - So What Is This Symbol Import Table?
Every Windows program uses Windows functions. That is, when a Windows program needs to, for example, read a file from a disk, it has to contact the operating system and "ask" it to do that.
Under Windows, the method of contacting the operating system is using Windows Application Programming Interface (API) (or Win32 API), which means (basically) Windows functions. To use Win32 API, it is necessary to import the functions that we are planning to use, from the correct Dynamic Link Library (DLL).


Dynamic Link Libraries - What Are DLLs?
A DLL is a file in the same PE format as a program, but it's not run as an ordinary program. A DLL is essentially a library of functions.

DLLs don't have to come with Windows; programmers can make their own DLLs for their programs too.

Functions from DLLs can be imported either at run-time or during program start-up (the time when the program is started).
To import a function at run-time, the program needs to use the Win32 API to load the DLL (LoadLibrary) and then use the Win32 API to get the memory address of the function it needs to call (GetProcAddress).
To import a function on program start-up, the PE file needs to have the import table entries for the desired functions.


The import tables need to include the function names, along with the names of their container DLLs.

Part of the import section is the import address table. When the Windows PE loader loads our program into memory, it replaces each entry in the import address table with the memory address of the corresponding Win32 API function. So when we import a symbol, we access it by referring to it as if it were a variable.

Memory - What About Computer Memory?
Just about any computer has memory. Memory is kind of like the "workspace" of a computer.
When the computer does something, it uses memory. Before writing data to a disk, you have to have the data in memory. To execute code, it needs to be loaded into memory first.
To access memory, you need pointers. Those can be any values that are (in the 32-bit world) 4 bytes long.
The operating system (Windows, in our case) restricts the use of memory of our program to a designated address space, so that our program doesn't cause trouble at important memory locations.
An address space is a range of memory addresses. Each byte, in memory, has its own unique memory address.

So if we have a 16-byte memory chip (very unrealistic, but okay for an example), and the bytes in memory are:
12, 17, 84, 244, 123, 93, 83, 194,
19, 23, 94, 38, 45, 75, 243, 95


Then the byte at memory address 7 will be 194, the byte at memory address 0 will be 12, the byte at memory address 11 will be 38.

The Address Bus - How Does The Processor Access Memory?
The processor uses the address bus to access memory. The address bus is limited in size, however, so if you try accessing a memory address that needs more bits than the address bus has, then the address will be truncated. This kind of truncation is called wrap-around.

So if you have a 2-bit address bus (again, not realistic, but fine for this example), and you try accessing the binary memory address 101, then you'll end up accessing the address 01, and the high-order part of your address will be ignored.

Let's say we have a 2-bit address bus and the same memory chip from above, with the same values. What would happen if you try accessing the byte at address 12? What about address 9?

Solution:

To find out the answer, we first need to get the binary version of the address.
12 to binary = 1100 (it's basically 1*(2^3) + 1*(2^2) + 0*(2^1) + 0*(2^0))
9 to binary = 1001 (1*(2^3) + 0*(2^2) + 0*(2^1) + 1*(2^0))

Since we have a 2-bit address bus, we need to truncate the addresses to form 2-bit values.
1100 to 2-bit = 00
1001 to 2-bit = 01

00 to decimal (from binary) = 0
01 to decimal = 1

So we are actually accessing bytes at physical addresses 0 and 1.
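The same truncation can be checked with a short Python sketch. It reuses the 16 example bytes from the memory chip above; the function names physical_address and read_byte are just made up for this illustration.

```python
# The 16-byte example "memory chip" from earlier in the tutorial.
MEMORY = [12, 17, 84, 244, 123, 93, 83, 194,
          19, 23, 94, 38, 45, 75, 243, 95]

def physical_address(address, bus_bits):
    """Keep only the low bus_bits bits of the address (wrap-around)."""
    return address & ((1 << bus_bits) - 1)

def read_byte(address, bus_bits=2):
    """Read the byte that a bus of the given width would actually reach."""
    return MEMORY[physical_address(address, bus_bits)]

for address in (12, 9):
    print(address,
          format(address, "04b"),        # the address in binary
          physical_address(address, 2),  # what is left after truncation
          read_byte(address))            # the byte we end up reading
```

With a 2-bit bus, addresses 12 and 9 reach the bytes at addresses 0 and 1, which hold 12 and 17 in the example chip.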

A physical address is an actual address in memory.
A virtual address is the address that we're trying to access; virtual addressing is used by a lot of operating systems, including Windows.

The Intel Architecture Software Developer's Manual volume 3 (Document) tells a lot about paging and virtual memory, if you want to read more about that.

Compiling A Program - How Is A Program Compiled?
To compile a program, a programmer usually compiles all the source files that go into the project (besides the files that are included, of course). The compiler compiles the source code into one or more object files. Then you use a linker to link all the object files into one executable PE file, though some compilers compile and link with one command.
Unless we're building a big project, one source code file is usually enough (not counting the includes).
Also, the linker doesn't know which code has the entry point (the point where program execution starts), so that point must be specified. In NASM, it can be set using the "..start:" special symbol; the program will start running from the point where the "..start:" label is located. This is especially useful when you link more than one object file.
In assembly language, you don't compile files. What is called compiling for higher-level languages is called assembling for assembly language. The other difference is that while compiling converts high-level statements to machine code (or assembly language, depending on the compiler), assembling converts single-line mnemonics to machine-level opcodes.

Note: An opcode is a special machine code that represents an instruction.
Also: An instruction is a processor-level unit of code that tells the processor what to do next.

16-Bit vs 32-Bit Code - What's The Difference?
When a computer first starts up, it is in 16-bit mode. It is the operating system that enters 32-bit mode and does everything else. But code written for 32-bit mode will most likely fail under 16-bit mode. That's because an instruction such as "take this 32-bit value 00000100000001000000100010000001" would be read as "take this 16-bit value 0000100010000001" (the low-order half, which comes first in memory), while the other 16 bits would be decoded as part of the next instruction, causing trouble.

Data vs BSS - Why Use BSS If We Have Data?
We can modify either section using code, but we can predefine the contents of only one of them.
With the data section, we use variables with predefined values that we can change later on. With the BSS section, we also use variables at predefined offsets (addresses), but we can't predefine their values, which usually start out as 0.
Using the BSS section helps reduce the size of our program file, somewhat.

There is another medium of storage, besides data and bss, about which I'll write in the next tutorial.

NASM Syntax - Some Things To Know
- Square brackets mean "at memory address."
- Operand-size prefixes (BYTE, WORD, DWORD, QWORD) are sometimes required and sometimes optional.
- <label_or_variable_name> means "the address of."
- Therefore, [<label_or_variable_name>] would mean "at memory address of."
- No more than one (1) memory reference allowed per instruction.
- Dollar sign ($) means the address of the current instruction.
- %define defines a single-line macro (similar to #define in C/C++) (see also, %undef).
- TIMES <n> <action> means "<n> times, do <action>" (see also, %rep).
- You use 'extern' to tell NASM about external symbols that are not defined in the current file and are resolved at the linking stage.

For more information about NASM and its syntax, you can refer to the NASM manual (NASM - The Netwide Assembler).

Byte, Word, Double-Word, Quad-Word - What Do Those Mean?
Those are just size specifications.
- A byte is 8 bits in size.
- A word is 16 bits in size (2 bytes).
- A double-word is 32 bits in size (4 bytes).
- A quad-word is 64 bits in size (8 bytes).

Data And BSS Sections - Defining Variables

Data
You use DQ to define a quad-word, DD to define a double-word, DW to define a word, and DB to define a byte.
myvar                       dd 65 
mystr                          db "Hello World!", 13, 10, 0
The above code will define myvar as a double-word, initialized to 65, and mystr as an array of bytes (an ANSI string), initialized to "Hello World!\r\n\0" .
These things are usually used in the data section.
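To get a feel for which bytes these two definitions produce, here is a rough Python sketch. The dd and db functions below are stand-ins written for this illustration; they only mimic what the NASM directives of the same name do for these simple cases.

```python
import struct

def dd(value):
    """Mimic NASM's DD for one integer: 4 bytes, little-endian."""
    return struct.pack("<I", value)

def db(*items):
    """Mimic NASM's DB: quoted strings become their ASCII bytes,
    numbers become single bytes."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            out += item.encode("ascii")
        else:
            out.append(item)
    return bytes(out)

myvar = dd(65)                         # myvar  dd 65
mystr = db("Hello World!", 13, 10, 0)  # mystr  db "Hello World!", 13, 10, 0

print(myvar)       # 65 stored as a double-word: the byte 65 followed by three zero bytes
print(mystr)       # the text, then carriage return, line feed, and the terminating zero
print(len(mystr))  # 12 characters + 3 extra bytes
```

Note that 65 is the ASCII code for "A", so db(65) and db("A") give the same single byte.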

BSS
You use RESQ to reserve a number of quad-words, RESD to reserve a number of double-words, RESW to reserve a number of words, and RESB to reserve a number of bytes.
my_int                resd 1 
my_str                   resb 512
The above code will reserve 1 double-word for my_int, and 512 bytes for my_str.
These things are usually used in the bss section.

Normally the things that we want to assign initial values to (e.g. "Hello World!", 65, etc.) are defined in the data section, while the things whose values we don't know yet are defined in the BSS section.
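As a quick check on how much memory the BSS reservations above ask for, here is a small Python sketch. The UNIT_SIZES table and the reserved_bytes function are hypothetical helpers written for this illustration; they just multiply the count by the unit size of each directive.

```python
# Size, in bytes, of one unit for each reservation directive.
UNIT_SIZES = {"resb": 1, "resw": 2, "resd": 4, "resq": 8}

def reserved_bytes(directive, count):
    """Number of bytes that '<directive> <count>' reserves."""
    return UNIT_SIZES[directive.lower()] * count

# my_int  resd 1
# my_str  resb 512
reservations = [("resd", 1), ("resb", 512)]
total = sum(reserved_bytes(d, n) for d, n in reservations)

print(reserved_bytes("resd", 1))    # one double-word
print(reserved_bytes("resb", 512))  # 512 single bytes
print(total)                        # bytes needed in memory, none of them stored in the file
```

Those bytes exist only once the program is loaded into memory, which is how the BSS section keeps the file on disk smaller.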

More About Defining Data
We can either define data using quotes (e.g. "Hello World! ") or we can define data using ASCII numbers (e.g. 13, 10, 0). Let's take a look at how NASM assembles our data definitions:
http://forum.codecal...tachmentid=4109
Here's the output EXE code:
http://forum.codecal...tachmentid=4110
As you can see, our definitions sort of start from line 3 of the EXE file. They actually start from the end of line 2, but that's not in the picture.


On line 2 of the .asm file, we use 13 then 10, which makes a new line (hence, it's now line 3 of the PE file). Then we have 65 and "A" ; both of those evaluate to 'A' , because 65 is the ASCII code for the capital letter A. Then we also defined some things right after defining some_str.

So the data section is part of the executable file, and you can actually find an executable's data section, by examining its contents. But be careful about changing any part of an executable file, or you might cause the file to not be executable any more. I don't recommend changing any part of a PE file, unless you really know what you're doing.

Operand-Size Prefixes - When Are They Necessary?
Operand-size prefixes are required when operand sizes can't be inferred (e.g. with the PUSH instruction).
But operand-size prefixes can be used in other situations, too, although not required.







First Tutorial:
This is the first tutorial.

Previous Tutorial:
Not applicable.

Next Tutorial:
Part 2

Attached Thumbnails

  • db01_exe_cc.PNG
  • db01_asm_cc.PNG



#606317 Click Sign in while pressing enter

Posted by RhetoricalRuvim on 02 August 2011 - 11:10 AM

The key code is 13.

And to submit a form, you do something like this:
<form id="form1" action="....."> .....whatever goes here..... </form> 
<script type="text/javascript"> 
// Submit the form under the id "form1". 
document.getElementById("form1").submit(); 
</script>



#599988 How to write data to a hard drive

Posted by RhetoricalRuvim on 14 May 2011 - 10:46 PM

This link might help, I think:
Category:ATA - OSDev Wiki


#592646 Starting assembly

Posted by RhetoricalRuvim on 26 February 2011 - 11:47 AM

The "commands" are called instructions. They can also be called opcodes or mnemonics. But yes, they are the same for any Intel or AMD processor.

Some resources for Linux, that I found:
Linux Assembly Tutorial - Step-by-Step Guide
Writing A Useful Program With NASM
Linux Assembly: resources


#591478 Can anyone help me understand this asm code (It's short)

Posted by RhetoricalRuvim on 15 February 2011 - 04:30 PM

Maybe it's because the book assumes that you do have the access rights like that for the code segment; I think it would work in real-address mode, but a lot of programs, these days, are designed for protected mode. I'm pretty sure it's possible with protected mode, also, but operating systems like Windows usually set everything up so you don't tamper with the code segment.

About references, the Intel manuals are a fairly good reference. There are older Intel processor manuals, such as the Intel Architecture Software Developer's Manual Volume 1, and there are also newer ones. There's also a book called The Peter Norton Programmer's Guide To The IBM PC (also known as the pink shirt book), copyright 1985 by Peter Norton; if you want to learn more about the PC, you could get that book. The Intel manuals explain a lot (about protected mode and other things like this), though.

Intel Architecture Software Developer's Manual, Volume 1: Basic Architecture
Intel Architecture Software Developer's Manual, Volume 2: Instruction Set Reference Manual
Intel Architecture Software Developer's Manual Volume 3: System Programming
Intel® 64 and IA-32 Architectures Software Developer's Manuals


#591451 Can anyone help me understand this asm code (It's short)

Posted by RhetoricalRuvim on 15 February 2011 - 06:17 AM

Oh, I think I got it. You can't edit things that are in the code segment. If you want to put a '\0' byte after the string then you have to use a data segment. I tried the following and it generated an error:
.386
.model flat, stdcall
option casemap:none

include \RS\include\ifiles.inc

.data
.data?
.code
start:
    jmp some_string

print_some_string:
    pop eax ;; address of some_string
    mov ebx, eax
    mov byte ptr [ebx+20], 0
    push eax
    call StdOut

    push eax
    call ExitProcess

some_string:
    call print_some_string
    db "Hello, how are you? ", 0

end start




