Types of Assembly Language

13 replies to this topic

#1
Cory Duchesne

    Newbie

  • Members
  • 10 posts
Hello,

I'm currently reading a beginner's book on Java, but in the opening chapter the book talks a bit about machine and assembly language, and it stirred my mind into a series of inquiries.

I'm a bit perplexed by what I'm reading here about assembly. It says that each computer understands its own dialect of assembly language, making it practically impossible to write an assembly language program that could run on different kinds of computers without rewriting it.

My first question: why do we always hear about different species of higher level programming languages like C, C++, Java, and Python, yet we never hear about the various species of assembly language?

Do assembly languages have their own editors in the same way that Python has IDLE or C++ has Visual C++?

I'm also curious about the usage of the word dialect. When we consider the various species of higher level programming language that I listed above, is it correct to consider each one a unique dialect? For instance, is C++ a unique dialect in contrast to other dialects like Java or Python? Am I using the word dialect correctly here?

If so, I'm wondering, what are some of the different dialects of assembly language? Do they have their own editors? Do modern PCs generally run more than one dialect?

#2
sfoulk526

    Newbie

  • Members
  • 9 posts
Okay - for your first question. Assembly languages are not computer specific, but microprocessor specific: each processor family has its own instruction set. Most people do not want to program in an Assembly language, so they leave all the ASM programming to the compiler writers. The high level, 'cushy' languages are translated into ASM, then from ASM into machine code, the lowest level (except for microcode, but we don't need to go there for now).

Assembly languages do not have their own editors as far as I know, but they might. Visual C++ and IDLE are IDEs (Integrated Development Environments), made for the specific language by companies/developers for ease of development in that language. Any text editor will work.

Dialect - no, C++ compared to Java and Python would be a different language. Big difference. Closer to C++ would be its prequel, C. You could call that a dialect, but I would rather say C is its predecessor...though some may disagree. No matter: if you learn C, you don't automatically know C++, and vice versa. But the more languages you do learn, the better you will be able to learn!

Assembly languages are *sorta* like different dialects, but not. But, if you learn and become proficient at one, learning the next is waaaaaay easier. The difficulty is learning the first one. After, the barrier falls! To become proficient at Assembly (ASM), learn the architecture of the machine, e.g. memory, addressing, etc.

Hope this helps. I have learned and used quite a few different Assembly languages - 8080A/8085, 8086/88, 80186, 80960, 68000 series, etc. The first one was the hardest.

After, all of them became real easy!

#3
dargueta

    Writes binary right handed and hex left handed

  • Moderators
  • 4,705 posts
  • Programming Language:C, Java, C++, PHP, Python, Perl, Assembly, Bash, Others
  • Learning:JavaScript
SciTE and ConTEXT have syntax highlighting for Intel assembly, plus customizable highlighters for other languages. I think gedit has one too but I'm not sure. You can play around with SciTE's syntax highlighter configuration to even let you assemble your code with NASM or whatever assembler you like.

As far as dialects, it depends on the kinds of assembly languages you're talking about. Some are relatively close; others are so different that knowing one is not really going to help you learn the other. Once you get to thinking on a low level, however, it'll be much easier learning the next one, like sfoulk526 said. The real issue is just learning the differences in architecture.

@sfoulk526: Ever try creating a microcode language for your own processor? Sooo much fun. :D
sudo rm -rf /

#4
Cory Duchesne

    Newbie

  • Members
  • 10 posts
Thank you both for the replies. In my studies over the next month, I will reference this thread frequently, and perhaps ask some more questions.

#5
sfoulk526

    Newbie

  • Members
  • 9 posts
Quote (dargueta): "@sfoulk526: Ever try creating a microcode language for your own processor? Sooo much fun."

@dargueta: Actually, I have toyed with creating a microprocessor instruction set. My aim was to create a RISC with the most minimal set. But I realized that could only be done by making some instructions have multiple functions, creating nightmarishly complex operand sets just to keep my mnemonic count low. Sorta like overloading operators in C++. I prefer clear cut instructions. It makes it much easier when you come back later.

Then I found a professor of mine had actually done that - and he got his PhD for it! Unreal...

His instruction set had *5* instructions. That must've been a mess.
:cool:

#6
Cory Duchesne

    Newbie

  • Members
  • 10 posts
Guys, can I ask you something? How important is it to know the architecture of a particular CPU when learning assembly?

Also, I have a question about assembly and how it relates to higher level languages. Does a higher level language like Python or C function by interpreting the assembly code of the specific CPU driving the computer? If so, are these higher level languages designed to interpret the many variations of assembly that speak to the different processors? Different computers are going to have different CPUs, and hence different dialects of assembly, so are higher level languages designed to know all of the different assembly languages out there?

#7
sfoulk526

    Newbie

  • Members
  • 9 posts
Well, I may want to let dargueta have this one as Assembly is fresher on his brain, but I'll give it a start.

It is very important to know the structure of the CPU as it pertains to its Assembly language. The caches, registers, flags, et al. that can be (controlled | regulated | used | interpreted | read from | written to) from the CPU's version of Assembly should be studied. So yes, yes, yes.

The C language and Python do not interpret the Assembly language. The C and Python compilers take the code you write (in C or Python) and translate it into Assembly. Then the Assembler, specific to that CPU and OS, converts it into machine code (object files), which the linker finally pieces together into an executable. Usually. It may have changed in this regard these days, so I'll defer to dargueta for the latest, contemporary version. That was how it was last I checked.
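
To make the assembler step a bit more concrete, here is a rough sketch in Python of what "converting Assembly into machine code" amounts to. Everything in it is made up for illustration: the three-instruction ISA, its opcode numbers, and the `assemble` function do not correspond to any real CPU, and a real assembler also handles labels, addressing modes, object file formats, and a lot more.

```python
# Toy assembler for a made-up ISA (hypothetical opcodes, not any real CPU).
# Every instruction encodes to two bytes: one opcode byte, one operand byte.
OPCODES = {
    "LOAD": 0x01,  # load an immediate value into the accumulator
    "ADD": 0x02,   # add an immediate value to the accumulator
    "HALT": 0xFF,  # stop; the operand byte is unused and encoded as 0
}


def assemble(source):
    """Translate assembly text into machine code bytes for the toy ISA."""
    machine_code = bytearray()
    for line in source.splitlines():
        line = line.split(";")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        operand = int(parts[1], 0) if len(parts) > 1 else 0
        if not 0 <= operand <= 0xFF:
            raise ValueError(f"operand out of range: {operand}")
        machine_code += bytes([OPCODES[mnemonic], operand])
    return bytes(machine_code)


program = """
    LOAD 2      ; accumulator = 2
    ADD 3       ; accumulator = 5
    HALT
"""
print(assemble(program).hex(" "))  # prints the bytes as hex pairs
```

The point is only that each mnemonic maps to a number the processor understands. A different processor would have a different table, which is why assembly is tied to the CPU.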

Hope this helps.

#8
Cory Duchesne

    Newbie

  • Members
  • 10 posts
Really appreciate the reply, I will return to it often, but I'm still a little unclear on compilers. I did some googling and wikiing, but what I still don't know is how I can tell what compiler I have running on my PC. Is it something that comes attached to whatever editor you use? For instance, I use IDLE for Python. Does the IDLE editor contain the compiler program that breaks Python down into assembly?

#9
sfoulk526

    Newbie

  • Members
  • 9 posts
I don't know IDLE - whether it's an integrated development environment (IDE) or just an editor. If it builds the executable for you, it's an IDE. If it's an IDE, then it either has the compiler code built into it, or it invokes the compiler externally. You'll have to read the docs that came with IDLE to find out.

#10
dargueta

    Writes binary right handed and hex left handed

  • Moderators
  • 4,705 posts
  • Programming Language:C, Java, C++, PHP, Python, Perl, Assembly, Bash, Others
  • Learning:JavaScript
A couple of things:
1) Python is not compiled to native (e.g. Intel) machine code. It's like Java - compiled into "fake" machine code that requires a program to interpret it, known as a virtual machine. It's similar but not equivalent to the machine code for processors; the "fake" machine code (i.e. bytecode) is usually more high-level than the native machine code that the virtual machine is compiled into.
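
If you want to see that "fake" machine code for yourself, the standard library's `dis` module will show it. This is just a small sketch; the `add` function is a made-up example, and the exact opcode names you see depend on which version of CPython you run.

```python
import dis


def add(a, b):
    return a + b


# CPython has already compiled add() to bytecode; dis only displays it.
# The opcode names printed here differ between Python versions.
for instruction in dis.get_instructions(add):
    print(instruction.opname, instruction.argrepr)

# The raw bytecode is just a bytes object stored on the function's code object.
print(add.__code__.co_code.hex())
```

Notice that nothing in the listing mentions registers or memory addresses. That is the "more high-level" part: the virtual machine, not your CPU, is what executes these instructions.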

2) When compiling C code* it's the compiler's job to know what kind of machine (e.g. MIPS, Intel, ARM, etc.) it's compiling for. GCC by default compiles for the machine that you're running, but it lets you specify the target machine if you like. The point of higher-level languages is to let you not worry about the details of the machine you're writing the program for. This is why properly written code in languages like C can be compiled for pretty much any processor--it's portable. The compiler is the one who worries about how to translate the high-level language into machine code, not you.

* Or any language that can be compiled into native machine language, like C++ and Visual Basic.
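
As a very rough illustration of "the compiler worries about the target, not you", here is a toy code generator in Python. The `compile_add` function and its target names are hypothetical, and a real compiler does vastly more than this; the sketch only shows one high-level operation (adding two small constants) turning into different assembly depending on the processor.

```python
def compile_add(a, b, target):
    """Emit assembly text that computes a + b for the requested target.

    Toy example: only handles small integer constants and two targets.
    """
    if target == "x86":
        # Intel syntax; the result ends up in the EAX register.
        return [f"mov eax, {a}", f"add eax, {b}"]
    if target == "mips":
        # MIPS; the result ends up in the $t0 register.
        return [f"li $t0, {a}", f"addi $t0, $t0, {b}"]
    raise ValueError(f"unsupported target: {target}")


# Same "source program", two different processors.
for target in ("x86", "mips"):
    print(target, "->", "; ".join(compile_add(2, 3, target)))
```

The caller never changes; only the output does. That is the portability you get from a high-level language and lose when you write the assembly yourself.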

3) When coding in assembly language (Intel assembly language, anyway), how much you need to know about the architecture of the machine you're writing for depends on several things, namely a) how complicated your program is; b) how the processor manufacturer has made the language backwards-compatible; c) how the processor manufacturer has made the processor's architecture backwards-compatible; d) sometimes the operating system you're writing for. I'll give you a few examples; assume everything's Intel unless I say otherwise.

A) If you're using really special instructions like RDMSR (ReaD Model-Specific Register) and WRMSR (WRite Model-Specific Register) then you're going to have to know the architecture you're running on. There are lists of MSRs, their uses, ID numbers and respective models in Intel documentation. Somewhere. :D

B) A 16-bit program will run on a 32-bit processor given certain conditions. It will not run as-is if the 32-bit processor is running in protected mode and the 16-bit program uses real-mode interrupts. It's the operating system's job to provide a 16-bit environment (by switching to real mode or, more commonly, by using virtual 8086 mode) where the 16-bit program will run, and then switch back for other 32-bit programs that'll explode if run in 16-bit real mode.

C) There are certain other general things that you need to keep in mind when coding across 16-, 32- and 64-bit modes. For example, a 32-bit program will execute just fine on a 64-bit processor in legacy (IA-32) mode and in the compatibility sub-mode of IA-32e mode. In 32-bit mode the INC and DEC instructions, which increment or decrement the operand you specify, have compact one-byte encodings for registers (opcodes 40h through 4Fh). In 64-bit mode you cannot use those one-byte encodings, as the same byte values have been remapped to the REX instruction prefixes; INC and DEC themselves are still available through their longer encodings. If your assembler isn't smart enough to pick the right encoding, or you hand-encode the one-byte forms, a 64-bit program will do weird things if it doesn't crash first.
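
Here is a small Python sketch of why the same bytes mean different things in the two modes. The function name `decode_byte` and its output strings are made up for illustration, and it only looks at the single byte range in question (40h through 4Fh); it is nowhere near a real disassembler.

```python
# 32-bit register numbering used by the one-byte INC/DEC encodings.
REGISTERS_32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_byte(byte, mode):
    """Describe a byte in the range 0x40-0x4F for 32-bit or 64-bit mode."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("this sketch only handles bytes 0x40 through 0x4F")
    if mode == 32:
        # 0x40-0x47 is INC r32, 0x48-0x4F is DEC r32; low 3 bits pick the register.
        mnemonic = "inc" if byte < 0x48 else "dec"
        return f"{mnemonic} {REGISTERS_32[byte & 0x07]}"
    if mode == 64:
        # In 64-bit mode the low four bits are the REX prefix flags W, R, X, B.
        flags = "".join(name for bit, name in zip((8, 4, 2, 1), "WRXB") if byte & bit)
        return f"REX{'.' + flags if flags else ''} prefix"
    raise ValueError("mode must be 32 or 64")


for value in (0x40, 0x48, 0x4B):
    print(hex(value), "|", decode_byte(value, 32), "|", decode_byte(value, 64))
```

So a byte that decrements EAX in a 32-bit program gets swallowed as a prefix for the following instruction in a 64-bit one, which is where the "weird things" come from.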
Sometimes the changes are subtle. For example, in old MIPS architectures, branches and jumps had delayed execution because of the way their pipelines worked. This means that if you had this:
add     $t0, $t0, -1
bne     $t0, $0, top_of_loop
lw      $t9, 0($t0)
The lw instruction would always execute, regardless of whether the branch was taken or not. The instruction immediately following a branch is said to be in the delay slot. Kinda stupid, right? If you didn't want this behavior, you had to change your program either to this:
add     $t0, $t0, -1
lw      $t9, 0($t0)
bne     $t0, $0, top_of_loop
nop
or this:
add     $t0, $t0, -1
bne     $t0, $0, top_of_loop
nop
lw      $t9, 0($t0)
Eventually MIPS had to break backwards compatibility and get rid of the delay slot, so any programs that depended on this behavior to speed up loop execution were broken. The instruction set didn't change, just its execution. Moral of the story: try to avoid depending on execution-specific details when you can. Instruction set differences across architectures are okay as long as they're backwards-compatible, which they should be.
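
If the delay slot is hard to picture, this toy simulator in Python might help. It is only a sketch: the tuple-based instructions, the single `t0` counter, and the `run` function are all invented for illustration and are merely MIPS-flavored. It runs the same three-instruction loop from above with and without a delay slot and counts how often the load executes.

```python
def run(program, t0, delay_slot):
    """Run a toy program and return how many times the load executed.

    Supported instructions (hypothetical, loosely MIPS-flavored):
      ("addi", n)   add n to the t0 counter
      ("bnez", pc)  branch to instruction index pc if t0 != 0
      ("lw",)       stand-in for a load; only its executions are counted
      ("nop",)      do nothing
    """
    loads = 0
    pc = 0
    pending_target = None          # taken branch waiting on its delay slot
    for _ in range(1000):          # safety cap against runaway loops
        if pc >= len(program):
            break
        op, *args = program[pc]
        # If the previous instruction was a taken branch, this one sits in
        # the delay slot: it still executes, and only then do we jump.
        next_pc = pc + 1 if pending_target is None else pending_target
        pending_target = None
        if op == "addi":
            t0 += args[0]
        elif op == "bnez" and t0 != 0:
            if delay_slot:
                pending_target = args[0]
            else:
                next_pc = args[0]
        elif op == "lw":
            loads += 1
        pc = next_pc
    return loads


loop = [("addi", -1), ("bnez", 0), ("lw",)]
print(run(loop, t0=2, delay_slot=True))   # load runs on every pass of the loop
print(run(loop, t0=2, delay_slot=False))  # load runs only once the loop exits
```

Same instructions, different number of loads, purely because of how the processor executes the branch. That is the kind of execution-specific detail you don't want your program to depend on.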

D) When making operating system-specific calls (e.g. if you're writing a device driver) you can't use one call for Linux and expect it to run on a Window$ machine. Windows will flash a Blue Screen of Death at you just long enough so you can see it but not long enough to read it. Then your computer will go up in flames.
Linux will probably just tell you to get rid of the driver.

4) If the compiler doesn't come with the IDE then you probably had to install it yourself. *NIX systems generally come with GCC preinstalled.

All right, I'm done now. Does that answer your question? :)

#11
Cory Duchesne

    Newbie

  • Members
  • 10 posts
**post withdrawn**

have to gather my thoughts more, sorry for this empty post.

Darguetta, thanks for the lengthy reply, I'll look it over when I get home.

#12
dargueta

    Writes binary right handed and hex left handed

  • Moderators
  • 4,705 posts
  • Programming Language:C, Java, C++, PHP, Python, Perl, Assembly, Bash, Others
  • Learning:JavaScript
One 't', Cory. :)



