
Assembly, Using Memory Allocation and Alphabet Algorithm (Win32, NASM)


#1 RhetoricalRuvim


Posted 19 August 2011 - 06:05 PM

In this tutorial we'll look at two more Win32 API functions and then write another algorithm function that uses one of them. In other words, we'll learn about more Win32 API functions, after which we'll practice using what we learned.

Overview
  • Two New Win32 API Functions
  • A New Algorithm
  • Example Program




Two New Win32 API Functions
When a program accesses memory in its code, data, or bss section, it doesn't use the physical memory address; it uses a virtual address. The processor then uses segmentation and page tables to translate that virtual address into a physical memory address. Inside the executable file itself, locations are usually recorded as Relative Virtual Addresses (RVAs), which are virtual addresses measured from where the program gets loaded.

When a program is loaded into memory, it's loaded to a specific virtual address.



What Is A Virtual Address?
A virtual address is used by programs just like a regular memory address. Then there are these things called page tables, into which the processor looks to see where the actual memory address is.

Let's say, for example, that the program is trying to access memory address 0x823. The page table for our program, that the operating system made, looks something like this:
Page Size: 64
Entries:
at physical address 0x500
not present
not present
at physical address 0x18000
not present
at physical address 0x12000
not present
not present
not present
.....
(this is entry number 32, now)
at physical address 0x24000
.....

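Once again, I've never worked with page tables directly, but I've read a little about them, so this is only roughly what an actual table looks like. Real x86 page tables use 4 KB pages and more than one level of tables; the page size of 64 here just keeps the numbers small. For the details of the Intel page tables, read the System Programming Guide volume of the Intel manual.

The processor would then find entry number 32 (counting from 0), because 0x823 / 64 = 32 with a remainder of 0x23, so that entry covers the addresses starting at 0x800. Then it would add the offset 0x23 to the page's physical address 0x24000, giving 0x24023 as the physical memory address to access.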
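
The lookup described above can be sketched in C. This is only a toy model under stated assumptions: the flat table, the `NOT_PRESENT` marker, and the `translate()` name are made up for illustration and are not how a real x86 processor stores its page tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE   64u         /* toy page size from the example table */
#define NOT_PRESENT UINT32_MAX  /* hypothetical marker for "not present" */

/* Translate a virtual address with a flat toy page table.
   Returns NOT_PRESENT when the page is missing; a real processor
   would raise a page fault and let the operating system handle it. */
uint32_t translate(const uint32_t *table, uint32_t entries, uint32_t vaddr)
{
    assert(table != NULL);

    uint32_t index  = vaddr / PAGE_SIZE;  /* which table entry to use */
    uint32_t offset = vaddr % PAGE_SIZE;  /* position inside that page */

    if (index >= entries || table[index] == NOT_PRESENT)
        return NOT_PRESENT;

    return table[index] + offset;         /* page's physical address + offset */
}
```

With the example table filled in (entry 32 at physical address 0x24000), `translate(table, 33, 0x823)` picks entry 32, keeps the offset 0x23, and returns 0x24023.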

If the program tries accessing another memory location, for which the page table is marked as not present, the processor would notify the operating system about the issue and the operating system would load the correct page into memory - usually from the hard drive or another storage device - and then direct the processor back to the program, so that it can continue running.
When the operating system loads the desired page into memory, it usually also has to make room, so it writes a page that hasn't been used recently out to the storage device. That way, even if the computer only has 2 GB of RAM, the program can still work with a 4 GB virtual address space.

So yeah, a virtual address is basically an address that's used for addressing, while the processor and the operating system take care of the actual physical memory and storage management.


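The RVA is the virtual address relative to where the program has been loaded (its image base), so virtual address = image base + RVA. Each section also has its own RVA recorded in the PE file's headers, so an address inside a section works out to the image base + the section's RVA + the offset within that section, depending on how the linker laid things out.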
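
As a small worked example of that arithmetic, here is a sketch in C. The function names are hypothetical, and the image base 0x00400000 used in the note below is only an assumed example value; the real base is whatever the loader ends up using.

```c
#include <assert.h>
#include <stdint.h>

/* Virtual address = where the image was loaded + the RVA. */
uint32_t rva_to_va(uint32_t image_base, uint32_t rva)
{
    return image_base + rva;
}

/* RVA = virtual address - where the image was loaded. */
uint32_t va_to_rva(uint32_t image_base, uint32_t va)
{
    assert(va >= image_base);  /* the address must lie at or above the base */
    return va - image_base;
}
```

For an image assumed to be loaded at 0x00400000, an RVA of 0x1000 corresponds to the virtual address 0x00401000, and the virtual address 0x00402010 corresponds to the RVA 0x2010.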

But when you use the bss section, you have to decide how much memory you need when you write the program. What if you end up needing more memory than you thought you would? Or what if you end up needing less memory, so you take up more resources than you need?

In cases like this, we can ask Windows for more memory. But, you need to remember, when you're done using the memory - just like when you're done using a file - you have to tell Windows about that.

You wouldn't be very happy if someone borrowed something from you and never brought it back, would you?

So it's something like "Windows, can I borrow 4 more KB of memory? ... Okay, Windows, I'm done using that, you can take it back now."

GlobalAlloc - Asking Windows For More (Global) Memory
GlobalAlloc() has the following parameters:
  • Flags
  • Number of bytes to allocate (borrow)

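The return value is a pointer to the newly allocated memory when the flags are 0 (GMEM_FIXED), which is what we'll use; if the allocation fails, the return value is NULL.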

For more reference to the GlobalAlloc() Win32 API function, visit GlobalAlloc Function (Windows).

GlobalFree - Telling Windows That You're Done Using The Allocated (Global) Memory
GlobalFree() has only one parameter:
  • The pointer to the memory location to free.

For more reference to the GlobalFree() Win32 API function, visit GlobalFree Function (Windows).
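
The borrow-and-return pattern for these two functions might look like this in C. It is a minimal sketch: the helper names `borrow()`, `give_back()`, and `borrow_and_return()` are made up for illustration, and outside Windows the sketch falls back to `malloc()`/`free()` as stand-ins (an assumption, so that it compiles anywhere).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
/* GMEM_FIXED is the flags value 0: the returned handle is a usable pointer. */
static void *borrow(size_t bytes) { return GlobalAlloc(GMEM_FIXED, bytes); }
/* GlobalFree() returns NULL when it succeeds. */
static int give_back(void *memory) { return GlobalFree(memory) == NULL; }
#else
#include <stdlib.h>
/* Stand-ins (an assumption of this sketch) so it also builds elsewhere. */
static void *borrow(size_t bytes) { return malloc(bytes); }
static int give_back(void *memory) { free(memory); return 1; }
#endif

/* Borrow `bytes` bytes, use them, and hand them back.
   Returns 1 on success, 0 if the memory could not be allocated or freed. */
int borrow_and_return(size_t bytes)
{
    assert(bytes > 0);

    char *buffer = borrow(bytes);
    if (buffer == NULL)
        return 0;                   /* Windows said no */

    memset(buffer, 'A', bytes);     /* "use" the borrowed memory */

    return give_back(buffer);       /* done, Windows can take it back */
}
```

Calling `borrow_and_return(4096)` corresponds to the "can I borrow 4 more KB" conversation above. Note that the sketch checks for a NULL pointer before touching the memory.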





A New Algorithm
We'll make an alphabet algorithm, now.

Before finding out what the algorithm does, it might help to know its parameters:
  • Starting letter.
  • Letter to stop after.
  • Whether or not to put spaces between every letter.

This algorithm returns a (pointer to a) string with letters of the alphabet in it, starting from the first parameter's letter and stopping after the second parameter's letter; if the third parameter is non-zero, it also puts a space between every two letters in the string.

Keeping track of the current letter is a big part of the algorithm.

The returned string is contained in a buffer allocated with the GlobalAlloc() Win32 API function, so you should free the pointer to the string, when you no longer need it, using the GlobalFree() Win32 API function.

I'll explain more details using comments, in the code. Here's the code for the abc() function:
;; abc() - returns a string with the alphabet. 
;; parameters: 
;;  	the letter to start from 
;;  	the letter to stop after 
;;  	whether to separate each letter from another letter with a space 
;; return value: 
;;  	the pointer to the new string with the letters 
abc: 
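	enter 4, 0                        ;; Set up the stack frame with one local variable, [ebp-4] (the string pointer). 
	push dword 0                      ;; Reserve a second local variable, [ebp-8] (the string length). 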
	push ebx                          ;; Save EBX. 
	
	;; First of all, we need to run a scan of the letters, without saving the values. 
	xor ebx, ebx                      ;; This should tell the loop not to save the values. 
	call .the_loop                    ;; Call the letter scan loop. 
	
	;; The total size of the new string should be equal to the value returned 
	;; by .the_loop in ECX. 
	;; We need to save that number. 
	mov eax, ecx 
	mov dword [ebp-8], eax 
	
	;; Now we need to ask Windows to allocate some memory for us. 
	;mov eax, dword [ebp-8]           ;; EAX is already [ebp-8]. 
	inc eax                           ;; We increment that number, because we'll need an extra byte for the NULL terminator. 
	push eax 
	push dword 0                      ;; No flags, for now. 
	call [GlobalAlloc]                ;; Call the Windows global memory allocation API function. 
	mov dword [ebp-4], eax            ;; Save the pointer that GlobalAlloc() returned. 
	
	mov ebx, eax                      ;; Also use that pointer for the .the_loop function. 
	call .the_loop                    ;; Now we run the loop and save everything to the new string. 
	
	mov eax, dword [ebp-4]            ;; We would return the pointer to the new string. 
	
	jmp .finish                       ;; We'll need space to define our nested function, 
	;; so we'll have to jump over the nested function 
	;; to the .finish label. 
	
	.the_loop: 
		;; EBX is the pointer to the buffer for the new string. 
		
		push edi                      ;; Save EDI. 
		
		xor ecx, ecx                  ;; We're supposed to be counting how many characters the new string would have. 
		;; We start counting from 0, for now. 
		
		mov eax, dword [ebp+12]       ;; Get the letter to stop at. 
		cmp eax, 122 
		jng .the_loop_over1 
			;; If the letter to stop after is greater than 'z', set it to 'z'. 
			mov eax, 122 
		.the_loop_over1: 
		cmp eax, 65 
		jnl .the_loop_over2 
			;; If the letter to stop after is less than 'A', set it to 'A'. 
			mov eax, 65 
		.the_loop_over2: 
		mov edi, eax                  ;; Save that letter in EDI. 
		
		mov eax, dword [ebp+08]       ;; Get the letter to start from. 
		mov edx, eax 
		
		.the_loop1: 
			;; Check if it's time to stop the loop yet. 
			cmp edx, edi              ;; Compare the current letter to the letter to stop after. 
			jg .the_loop1s            ;; If the current letter is greater, break the loop. 
			
			cmp edx, 65 
			jnl .the_loop1over1 
				;; If the current letter is less than 'A', set it to 'A'. 
				mov edx, 65 
			.the_loop1over1: 
			
			cmp edx, 122 
			jng .the_loop1over2 
				;; If the current letter is greater than 'z', reset it to 'A'. 
				mov edx, 65 
			.the_loop1over2: 
			
			cmp edx, 97 
			jnl .the_loop1over2b 
				cmp edx, 90 
				jng .the_loop1over2b 
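					;; If the current letter is less than 'a' and greater than 'Z' (not a letter character), 
					;; set the current letter to 'a'. 
					mov edx, 97 
					
					;; If that jump moved us past the letter to stop after, break the loop. 
					cmp edx, edi 
					jg .the_loop1s 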
			.the_loop1over2b: 
			
			cmp ebx, 0 
			jz .the_loop1over3 
				;; If EBX is not a NULL pointer (meaning if we're supposed to save the output), 
				;; do the following: 
				
				;; Get the current letter. 
				mov eax, edx 
				
				;; Save the current character. 
				mov byte [ebx], al 
				
				;; Increment the pointer. 
				inc ebx 
				
				cmp dword [ebp+16], 0 ;; Check if the third parameter is FALSE. 
				jz .the_loop1over3    ;; If so, skip over the space-adding part of the code. 
				
				;; Otherwise, save a space to where the pointer is pointing. 
				mov byte [ebx], 32 
				
				;; And increment the pointer. 
				inc ebx 
			.the_loop1over3: 
			
			cmp dword [ebp+16], 0     ;; Check if the third parameter is FALSE. 
			jz .the_loop1over4 
			
			;; If it's TRUE, increment the count an extra time. 
			inc ecx 
			
			.the_loop1over4: 
			
			;; In any case, we'll still need to increment the count. 
			inc ecx 
			
			;; Increment the current letter. 
			inc edx 
			
			;; Continue the loop. 
			jmp .the_loop1 
		.the_loop1s: 
		
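		;; If we're putting spaces between the letters: 
		cmp dword [ebp+16], 0 
		jz .the_loop_over3 
			;; If no letters were produced, there is no trailing space to trim, 
			;; so handle it like the no-spaces case. 
			test ecx, ecx 
			jz .the_loop_over3 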
			;; Since the last letter doesn't need a following space, we'll decrement the character count. 
			dec ecx 
			
			;; If the pointer is NULL, then this next part of the code is to be skipped. 
			cmp ebx, 0 
			jz .the_loop_over4 
			
			;; But we still put a space for the last character, so we'll have to put a NULL to that position. 
			mov byte [ebx-1], 0 
			
			;; .the_loop_over3 is for not-putting-spaces code. 
			jmp .the_loop_over4 
		.the_loop_over3: 
			;; If the pointer is NULL, then this next part of the code is to be skipped. 
			cmp ebx, 0 
			jz .the_loop_over4 
			
			;; Save a NULL character at [pointer]. 
			mov byte [ebx], 0 
		.the_loop_over4: 
		
		pop edi                       ;; Restore EDI. 
	ret 0 
	
	.finish: 
	
	pop ebx                           ;; Restore EBX. 
	leave 
ret 12
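
For readers who find the assembly hard to follow, the same two-pass idea (count first, then allocate and fill) can be sketched in C. This is not a translation of the listing: the names `abc_pass()` and `abc_sketch()` are hypothetical, `malloc()` stands in for GlobalAlloc() so the sketch compiles outside Windows, and the space is written before each letter after the first instead of trimming a trailing space.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One pass over the letters. When out is NULL nothing is stored and the
   pass only counts. Returns the string length, excluding the final NUL. */
static size_t abc_pass(int start, int stop, int spaces, char *out)
{
    size_t count = 0;

    if (stop > 'z') stop = 'z';      /* clamp the letter to stop after */
    if (stop < 'A') stop = 'A';
    if (start < 'A') start = 'A';    /* clamp the starting letter */

    for (int c = start; c <= stop; c++) {
        if (c > 'Z' && c < 'a') {    /* skip the non-letters between 'Z' and 'a' */
            c = 'a';
            if (c > stop)
                break;               /* the skip moved us past the stop letter */
        }
        if (spaces && count > 0) {   /* separator before every letter but the first */
            if (out) out[count] = ' ';
            count++;
        }
        if (out) out[count] = (char)c;
        count++;
    }

    if (out) out[count] = '\0';      /* terminate the string */
    return count;
}

/* Returns a newly allocated string, or NULL if the allocation failed.
   The caller releases it with free() (GlobalFree() in the tutorial). */
char *abc_sketch(int start, int stop, int spaces)
{
    size_t length = abc_pass(start, stop, spaces, NULL);   /* pass 1: count */

    char *buffer = malloc(length + 1);                     /* +1 for the NUL */
    if (buffer == NULL)
        return NULL;

    size_t written = abc_pass(start, stop, spaces, buffer); /* pass 2: fill */
    assert(written == length);
    (void)written;

    return buffer;
}
```

Running the counting pass first means the buffer is exactly as large as the string needs, which is the reason the assembly version calls its loop twice.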






Example Program


Example Program - The Idea
  • Display a message box with the alphabet 'A' through 'z' (meaning, first 'A' through 'Z' and then 'a' through 'z'); no spaces between letters.
  • Display a message box with the alphabet 'F' through 'P'; no spaces between letters.
  • Display a message box with the alphabet 'a' through 'z'; put spaces between letters.

Example Program - The Code
;; Define the externs. 
extern MessageBoxA 
extern ExitProcess 
extern GlobalAlloc 
extern GlobalFree 

;; Construct our symbol import table. 
import MessageBoxA user32.dll 
import ExitProcess kernel32.dll 
import GlobalAlloc kernel32.dll 
import GlobalFree kernel32.dll 

;; This is the code section; use 32-bit code. 
section .text use32 
;; Start execution here. 
..start: 

;; Call the main() function. 
call main 

;; Exit, returning whatever main() returned. 
push eax 
call [ExitProcess] 

main: 
	enter 4, 0 
	
	push dword 0                      ;; Don't put spaces between every letter. 
	push dword 122                    ;; Stop after 'z'. 
	push dword 65                     ;; Start at 'A'. 
	call abc 
	mov dword [ebp-4], eax            ;; Save the pointer to the string. 
	
	;; Display a message box with the new string. 
	push dword 0 
	push dword the_title 
	push dword [ebp-4] 
	push dword 0 
	call [MessageBoxA] 
	
	;; Free the buffer for the string. 
	push dword [ebp-4] 
	call [GlobalFree] 
	
	push dword 0                      ;; Don't put spaces between every letter. 
	push dword 80                     ;; Stop after 'P'. 
	push dword 70                     ;; Start at 'F'. 
	call abc 
	mov dword [ebp-4], eax            ;; Save the pointer to the string. 
	
	;; Display a message box with the new string. 
	push dword 0 
	push dword the_title 
	push eax                          ;; Since EAX is already equal to [ebp-4], why not just use EAX? 
	push dword 0 
	call [MessageBoxA] 
	
	;; Free the buffer. 
	push dword [ebp-4] 
	call [GlobalFree] 
	
	push dword 1                      ;; Put spaces between every letter. 
	push dword 122                    ;; Stop after 'z'. 
	push dword 97                     ;; Start at 'a'. 
	call abc 
	mov dword [ebp-4], eax            ;; Save the pointer to the string. 
	
	;; Display a message box with the new string. 
	push dword 0 
	push dword the_title 
	push dword [ebp-4] 
	push dword 0 
	call [MessageBoxA] 
	
	;; Free the string buffer. 
	push dword [ebp-4] 
	call [GlobalFree] 
	
	;; Return 0. 
	xor eax, eax 
	leave 
ret 

;; abc() - returns a string with the alphabet. 
;; parameters: 
;;  	the letter to start from 
;;  	the letter to stop after 
;;  	whether to separate each letter from another letter with a space 
;; return value: 
;;  	the pointer to the new string with the letters 
abc: 
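	enter 4, 0                        ;; Set up the stack frame with one local variable, [ebp-4] (the string pointer). 
	push dword 0                      ;; Reserve a second local variable, [ebp-8] (the string length). 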
	push ebx                          ;; Save EBX. 
	
	;; First of all, we need to run a scan of the letters, without saving the values. 
	xor ebx, ebx                      ;; This should tell the loop not to save the values. 
	call .the_loop                    ;; Call the letter scan loop. 
	
	;; The total size of the new string should be equal to the value returned 
	;; by .the_loop in ECX. 
	;; We need to save that number. 
	mov eax, ecx 
	mov dword [ebp-8], eax 
	
	;; Now we need to ask Windows to allocate some memory for us. 
	;mov eax, dword [ebp-8]           ;; EAX is already [ebp-8]. 
	inc eax                           ;; We increment that number, because we'll need an extra byte for the NULL terminator. 
	push eax 
	push dword 0                      ;; No flags, for now. 
	call [GlobalAlloc]                ;; Call the Windows global memory allocation API function. 
	mov dword [ebp-4], eax            ;; Save the pointer that GlobalAlloc() returned. 
	
	mov ebx, eax                      ;; Also use that pointer for the .the_loop function. 
	call .the_loop                    ;; Now we run the loop and save everything to the new string. 
	
	mov eax, dword [ebp-4]            ;; We would return the pointer to the new string. 
	
	jmp .finish                       ;; We'll need space to define our nested function, 
	;; so we'll have to jump over the nested function 
	;; to the .finish label. 
	
	.the_loop: 
		;; EBX is the pointer to the buffer for the new string. 
		
		push edi                      ;; Save EDI. 
		
		xor ecx, ecx                  ;; We're supposed to be counting how many characters the new string would have. 
		;; We start counting from 0, for now. 
		
		mov eax, dword [ebp+12]       ;; Get the letter to stop at. 
		cmp eax, 122 
		jng .the_loop_over1 
			;; If the letter to stop after is greater than 'z', set it to 'z'. 
			mov eax, 122 
		.the_loop_over1: 
		cmp eax, 65 
		jnl .the_loop_over2 
			;; If the letter to stop after is less than 'A', set it to 'A'. 
			mov eax, 65 
		.the_loop_over2: 
		mov edi, eax                  ;; Save that letter in EDI. 
		
		mov eax, dword [ebp+08]       ;; Get the letter to start from. 
		mov edx, eax 
		
		.the_loop1: 
			;; Check if it's time to stop the loop yet. 
			cmp edx, edi              ;; Compare the current letter to the letter to stop after. 
			jg .the_loop1s            ;; If the current letter is greater, break the loop. 
			
			cmp edx, 65 
			jnl .the_loop1over1 
				;; If the current letter is less than 'A', set it to 'A'. 
				mov edx, 65 
			.the_loop1over1: 
			
			cmp edx, 122 
			jng .the_loop1over2 
				;; If the current letter is greater than 'z', reset it to 'A'. 
				mov edx, 65 
			.the_loop1over2: 
			
			cmp edx, 97 
			jnl .the_loop1over2b 
				cmp edx, 90 
				jng .the_loop1over2b 
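					;; If the current letter is less than 'a' and greater than 'Z' (not a letter character), 
					;; set the current letter to 'a'. 
					mov edx, 97 
					
					;; If that jump moved us past the letter to stop after, break the loop. 
					cmp edx, edi 
					jg .the_loop1s 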
			.the_loop1over2b: 
			
			cmp ebx, 0 
			jz .the_loop1over3 
				;; If EBX is not a NULL pointer (meaning if we're supposed to save the output), 
				;; do the following: 
				
				;; Get the current letter. 
				mov eax, edx 
				
				;; Save the current character. 
				mov byte [ebx], al 
				
				;; Increment the pointer. 
				inc ebx 
				
				cmp dword [ebp+16], 0 ;; Check if the third parameter is FALSE. 
				jz .the_loop1over3    ;; If so, skip over the space-adding part of the code. 
				
				;; Otherwise, save a space to where the pointer is pointing. 
				mov byte [ebx], 32 
				
				;; And increment the pointer. 
				inc ebx 
			.the_loop1over3: 
			
			cmp dword [ebp+16], 0     ;; Check if the third parameter is FALSE. 
			jz .the_loop1over4 
			
			;; If it's TRUE, increment the count an extra time. 
			inc ecx 
			
			.the_loop1over4: 
			
			;; In any case, we'll still need to increment the count. 
			inc ecx 
			
			;; Increment the current letter. 
			inc edx 
			
			;; Continue the loop. 
			jmp .the_loop1 
		.the_loop1s: 
		
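		;; If we're putting spaces between the letters: 
		cmp dword [ebp+16], 0 
		jz .the_loop_over3 
			;; If no letters were produced, there is no trailing space to trim, 
			;; so handle it like the no-spaces case. 
			test ecx, ecx 
			jz .the_loop_over3 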
			;; Since the last letter doesn't need a following space, we'll decrement the character count. 
			dec ecx 
			
			;; If the pointer is NULL, then this next part of the code is to be skipped. 
			cmp ebx, 0 
			jz .the_loop_over4 
			
			;; But we still put a space for the last character, so we'll have to put a NULL to that position. 
			mov byte [ebx-1], 0 
			
			;; .the_loop_over3 is for not-putting-spaces code. 
			jmp .the_loop_over4 
		.the_loop_over3: 
			;; If the pointer is NULL, then this next part of the code is to be skipped. 
			cmp ebx, 0 
			jz .the_loop_over4 
			
			;; Save a NULL character at [pointer]. 
			mov byte [ebx], 0 
		.the_loop_over4: 
		
		pop edi                       ;; Restore EDI. 
	ret 0 
	
	.finish: 
	
	pop ebx                           ;; Restore EBX. 
	leave 
ret 12 

;; The data section. 
section .data 
the_title                                             db "Memory Alphabet Example", 0 

;; We don't have to define every section in our source code; NASM would do the defining even if we don't.

Example Program - The Output
You should get three message boxes in a row, for the output. The message boxes should say:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
FGHIJKLMNOP
a b c d e f g h i j k l m n o p q r s t u v w x y z

Here's a screenshot of the third message box:
http://forum.codecal...tachmentid=4153











First Tutorial:
Intro To Win32 Assembly, Using NASM

Previous Tutorial:
File I/O and 'incbin'

Next Tutorial:
Handling Bugs In Your Programs






References:
Intel Manual System Programming Guide: Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3 (3A & 3B): System Programming Guide
GlobalAlloc Win32 API Function: GlobalAlloc Function (Windows)
GlobalFree Win32 API Function: GlobalFree Function (Windows)


Edited by RhetoricalRuvim, 20 August 2011 - 07:01 PM.
