Download Emu8086 Microprocessor Emulator

Tordis Hurrle

Aug 3, 2024, 5:36:14 PM
to quamsgeantleali

The emu8086 microprocessor emulator is a software tool that gives users an environment in which they can write, test, and debug assembly programs for the Intel 8086 microprocessor, the first member of the x86 family.

The emu8086 emulator features a simple and intuitive interface that allows users to write their programs in assembly language directly within the application. Additionally, the emulator includes a built-in debugger that enables users to test their code step-by-step and identify any errors or bugs in their program.

With the emu8086 emulator, users can also simulate the execution of their code on an Intel-based microprocessor, giving them a clear understanding of how their program will run on real hardware. This improved understanding can help developers optimize their programs for better performance.

In summary, the emu8086 microprocessor emulator is a feature-rich and user-friendly tool that allows developers to write, test, and debug assembly programs for the Intel x86 architecture on a simulated microprocessor with ease.

Assembly language is a low-level programming language that is very fast, uses fewer resources compared to higher-level languages, and can be executed by translating directly to machine language via an assembler. According to Wikipedia:

In computer programming, assembly language is any low-level programming language with a very strong correspondence between the instructions in the language and the architecture's machine code instructions.

We know that a processor (also known as the CPU, or Central Processing Unit) executes all types of operations, effectively working as the brain of a computer. However, it only recognizes strings of 0's and 1's. As you can imagine, it's cumbersome to code in machine language. An assembly language is therefore designed for a specific family of processors and represents its instructions in symbolic code, which is far easier for a human being to understand. Even so, as you can guess, developing in assembly language is still difficult and somewhat inconvenient.

Assemblers are programs that translate assembly language code to its equivalent machine language code. There are many assemblers targeting various microprocessors in the market today like MASM, TASM, NASM, etc. For a list of different assemblers, visit this Wikipedia page.

Code editors are programs in which you can write code, modify it, and save it to a file. Some tools used for assembly language are VS Code, DOSBox (a DOS emulator in which DOS-based assemblers can be run), emu8086, and so on. Online tools are also available, like the popular online editor Ideone. We will use emu8086, which comes with the environment needed to start our journey in assembly language.

We can simply write the assembly code and emulate it in emu8086, and it'll run. However, unless the program calls an exit routine or executes a halt instruction, the processor will keep executing whatever comes next in memory until it is stopped by the OS or by emu8086 itself. Assembly code is saved in a file with the .asm extension.

There are also some good practices, like defining the memory model and stack size at the very beginning. For the small model, define the data and code segments after the stack. The code segment contains the code to execute. In the example structure given here, I have created a main procedure (procedures are called functions or methods in other programming languages), in which code execution starts. At the end of it, I have called a predefined interrupt service to indicate that the code has finished executing.

The first line, .model small, defines the memory model to use. Some recognized memory models are tiny, small, medium, compact, large, and so on. The small memory model supports one data segment and one code segment, which are usually enough for small programs. The following line, .stack 100H, defines the stack size as a hexadecimal number; 100H is 256 in decimal. Anything from a semicolon (;) to the end of a line is a comment that the assembler ignores.
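The structure described above can be sketched as follows. This is a minimal skeleton in emu8086 syntax, not the original listing: the exit via DOS function 4CH of INT 21H is an assumption about which "predefined statement with interrupt" was meant, although it is the usual way to end a DOS program.

```asm
.model small            ; one data segment and one code segment
.stack 100h             ; reserve 256 bytes for the stack

.data
    ; variable declarations go here

.code
main proc
    ; program instructions go here

exit:
    mov ah, 4ch         ; DOS function 4Ch: terminate the program
    int 21h             ; call the DOS interrupt
main endp
end main                ; execution starts at main
```

Everything between main proc and exit: is where the examples in the rest of the article would be placed.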

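Registers are very fast storage locations inside the CPU itself. The emu8086 can emulate all internal registers of the Intel 8086 microprocessor. All of these registers are 16 bits long and are grouped into several categories: the general-purpose registers (AX, BX, CX, DX), the segment registers (CS, DS, ES, SS), the pointer and index registers (SP, BP, SI, DI), the instruction pointer (IP), and the flags register.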
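The later sections use names such as AH, AL, and DL, which are the 8-bit halves of the general-purpose registers. A small sketch of how the halves relate to the full 16-bit register (the values are arbitrary, chosen only for illustration):

```asm
.model small
.stack 100h
.code
main proc
    mov ax, 1234h       ; AX = 1234h, so AH = 12h (high byte) and AL = 34h (low byte)
    mov al, 0ffh        ; only the low byte changes: AX = 12FFh
    mov dx, 0           ; DX = 0000h
    mov dl, 'A'         ; DL = 41h (ASCII code of 'A'), DH is still 00h

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

Stepping through this in the emulator shows the register panel changing after each MOV.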

Addition (ADD) and Subtraction (SUB): ADD adds the data of the destination and source operand and stores the result in destination. Both operands should be of the same type (words or bytes), otherwise, the assembler will generate an error. The subtraction instruction subtracts the source from destination and stores the result in destination.
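As a short sketch of the two instructions (the operand values are arbitrary):

```asm
.model small
.stack 100h
.code
main proc
    mov ax, 5           ; AX = 5
    mov bx, 3           ; BX = 3
    add ax, bx          ; destination AX = AX + BX = 8
    sub ax, 2           ; destination AX = AX - 2 = 6

    ; add ax, bl        ; would be rejected: AX is a word, BL is a byte

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

The commented-out line shows the size mismatch that the assembler reports as an error.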

Label: A label is a symbolic name for the address of the instruction that is given immediately after the label declaration. It can be placed at the beginning of a statement and serve as an instruction operand. The exit: used before is a label. Labels are of two types.

Symbolic Labels: A symbolic label consists of an identifier or symbol followed by a colon (:). Each symbolic label must be defined only once, because it has global scope and appears in the object file's symbol table.

Numeric Labels: Some assemblers also support numeric labels. A numeric label consists of a single digit in the range zero (0) through nine (9) followed by a colon (:). Numeric labels are used only for local reference and are excluded from the object file's symbol table. Hence, they have a limited scope and can be redefined repeatedly.

Jump instructions: The jump instructions transfer the program control to a new set of instructions indicated by the label provided as an operand. There are two types of jump instructions.

Conditional jump: These instructions jump only if a condition is satisfied and are typically placed after a CMP instruction. A conditional jump first checks the flags to see whether the condition holds and, if so, jumps to the label given as its operand. It is quite similar to an if statement in other programming languages. There are 31 conditional jump instructions available in 8086 assembly language.
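A minimal sketch of an if statement built from CMP and a conditional jump. The label name skip and the values are illustrative assumptions:

```asm
.model small
.stack 100h
.code
main proc
    mov al, 3
    mov bl, 0
    cmp al, 5           ; sets the flags according to AL - 5
    jae skip            ; unsigned AL >= 5: jump over the "if" body
    mov bl, 1           ; executed only when AL < 5
skip:
    ; with AL = 3 the jump is not taken, so BL = 1 here

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

Note that the jump tests the opposite of the if condition, because it is used to skip the body.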

In an assembly program, all variables are declared in the data segment. The emu8086 assembler provides define directives for declaring variables. Specifically, we'll use the DB (define byte) and DW (define word) directives in this article, which allocate 1 byte and 2 bytes respectively.

Following is an example of variable declaration, where we initialize num and char with a value that can be changed later. The output is initialized with a string and has a dollar symbol ($) at the end to indicate the end of string. The input_char is declared without any initial value. We can use ? to indicate that the value is currently unknown.
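A sketch of what such a declaration could look like. The variable names come from the text above; the sizes and initial values are assumptions, since the original listing is not available:

```asm
.data
    num         dw 10                   ; word (2 bytes), initial value 10 (assumed)
    char        db 'A'                  ; byte (1 byte), initial value 'A' (assumed)
    output      db 'Hello, world!$'     ; string; $ marks the end for DOS string output
    input_char  db ?                    ; byte with no initial value
```

This fragment belongs in the data segment of the program, before the .code directive.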

We cannot use the variables in the code segment just yet! For using these variables in the code segment, we have to first move the address of the data segment to the DS (data segment) register. Use this line at the beginning of the code segment to import all variables.
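The step described above is usually written as the two instructions below. The variable char and its value are illustrative assumptions:

```asm
.model small
.stack 100h

.data
    char db 'A'         ; example variable (assumed value)

.code
main proc
    mov ax, @data       ; AX = segment address of the data segment
    mov ds, ax          ; a segment register cannot take an immediate, so go through AX

    mov al, char        ; variables can be used from here on: AL = 41h ('A')

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

The detour through AX is needed because the 8086 has no instruction that moves an immediate value directly into DS.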

The emu8086 emulator supports user input by setting the predefined value 01 or 01H in the AH register and then calling interrupt 21H (INT 21H). It takes a single character from the user and saves the ASCII value of that character in the AL register. The emu8086 emulator displays all values in hexadecimal.
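A minimal sketch of reading one character; the variable name input_char is taken from the declaration section, and storing the result in it is an illustrative choice:

```asm
.model small
.stack 100h

.data
    input_char db ?     ; will hold the character typed by the user

.code
main proc
    mov ax, @data
    mov ds, ax          ; make the data segment addressable

    mov ah, 01h         ; DOS function 01h: read one character with echo
    int 21h             ; waits for a key; its ASCII code is returned in AL
    mov input_char, al  ; e.g. typing 5 stores 35h

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

Because the result is an ASCII code, a typed digit must have '0' (30H) subtracted from it before it can be used as a number.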

The emu8086 supports single-character output as well as multi-character (string) output. As with input, we have to put a predefined value in the AH register and call the interrupt. The predefined value is 02 or 02H for single-character output and 09 or 09H for string output. The value to print must be placed in the DX data register (DL for a single character) before calling the interrupt.

As shown in the code, for single-character output, we store the value in the DL register because a character is one byte (8 bits) long. For string output it is a bit different. We must load the effective address of the string variable (its offset within the data segment) into the DX register using the LEA instruction. The string variable must be defined in the data segment.
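Both kinds of output can be sketched as follows. The string contents are an assumption made for illustration:

```asm
.model small
.stack 100h

.data
    output db 'Hello, world!$'  ; $ marks the end of the string

.code
main proc
    mov ax, @data
    mov ds, ax          ; make the data segment addressable

    mov dl, 'A'         ; character to print goes in DL
    mov ah, 02h         ; DOS function 02h: print one character
    int 21h

    lea dx, output      ; DX = offset of the string in the data segment
    mov ah, 09h         ; DOS function 09h: print a $-terminated string
    int 21h

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

The $ terminator itself is not printed; it only tells function 09H where to stop.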

There is also the unconditional JMP instruction, which, combined with a conditional jump, lets us build the equivalent of the else branch found in higher-level languages. Following is an assembly code that compares the AL register value to 5 and sets an appropriate value in the BL register.
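One way this comparison could be written is sketched below. The values stored in BL (1 when AL is greater than 5, 0 otherwise) and the label names are assumptions, since the original listing is not available:

```asm
.model small
.stack 100h
.code
main proc
    mov al, 7
    cmp al, 5           ; compare AL with 5
    jg  greater         ; signed AL > 5: go to the "if" branch
    mov bl, 0           ; "else" branch: AL <= 5
    jmp done            ; skip over the "if" branch
greater:
    mov bl, 1           ; "if" branch: AL > 5
done:
    ; with AL = 7 the jump is taken, so BL = 1 here

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

Without the JMP, execution would fall through from the else branch into the code at greater: and overwrite BL.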

We can also write loops in assembly language. Unlike higher-level languages, however, it does not provide distinct loop constructs such as for or while. The emu8086 emulator supports five loop instructions (LOOP, LOOPE, LOOPNE, LOOPNZ, and LOOPZ), but they are not flexible enough for many situations, so we can build our own loops from compare and jump instructions. Following are various types of loops implemented in assembly language, all of which are equivalent.

The for loop has an initialization section where loop variables are initialized, a loop condition section, and finally, an increment/decrement section to do some calculation or change loop variables before the next iteration. Following is an example for loop in C language.
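As a sketch, here is a for loop written in emu8086 assembly, with an illustrative C form shown in the comments. The loop body (summing the counter) and the register choices are assumptions made for the example:

```asm
; C form (illustrative):
;   for (i = 0; i < 5; i++) { sum += i; }
.model small
.stack 100h
.code
main proc
    mov ax, 0           ; sum = 0
    mov cx, 0           ; initialization: i = 0
for_loop:
    cmp cx, 5           ; condition: i < 5 ?
    jge for_end         ; leave the loop when i >= 5
    add ax, cx          ; body: sum += i
    inc cx              ; increment: i++
    jmp for_loop        ; next iteration
for_end:
    ; AX = 0 + 1 + 2 + 3 + 4 = 10 (0Ah)

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

The three parts of the for loop map onto the MOV before the label, the CMP/JGE pair at the top, and the INC before the jump back.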

Unlike for loop, while loop has no initialization section. It only has a loop condition section, which if satisfied, executes the body part. In the body part, we can do some calculations before the next iteration. Following is an example while loop in C language.
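A sketch of the same computation as a while loop in emu8086 assembly, with an illustrative C form in the comments (the loop body and register choices are again assumptions):

```asm
; C form (illustrative):
;   while (i < 5) { sum += i; i++; }
.model small
.stack 100h
.code
main proc
    mov ax, 0           ; sum = 0, set up before the loop
    mov cx, 0           ; i = 0, set up before the loop
while_loop:
    cmp cx, 5           ; condition is checked before every iteration
    jge while_end       ; leave the loop when i >= 5
    add ax, cx          ; body: sum += i
    inc cx              ; body: i++
    jmp while_loop
while_end:
    ; AX = 10 (0Ah)

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

At the instruction level this is identical to the for loop; only the way the C source groups the pieces differs.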

Similar to the while loop, the do-while loop has a loop condition section and body. The only difference is that the code in the body executes at least once, even if the condition evaluates to false. Following is an example do-while loop in C language.
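A sketch of the do-while form in emu8086 assembly, with an illustrative C form in the comments (the loop body and register choices are assumptions):

```asm
; C form (illustrative):
;   do { sum += i; i++; } while (i < 5);
.model small
.stack 100h
.code
main proc
    mov ax, 0           ; sum = 0
    mov cx, 0           ; i = 0
do_loop:
    add ax, cx          ; body runs first: sum += i
    inc cx              ; i++
    cmp cx, 5           ; condition is checked after the body
    jl  do_loop         ; repeat while i < 5
    ; AX = 10 (0Ah)

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

Because the test sits at the bottom, the body executes once even if CX started at 5 or more, and only one jump is needed per iteration.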

In the Inc folder, there is a file emu8086.inc, which defines some useful procedures and macros that can make coding easier. We have to include the file at the beginning of our source code to use these functionalities.
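A minimal sketch of using the include file. PRINT, PRINTN, and PUTC are macros defined in emu8086.inc; the messages are illustrative assumptions, and the exact include syntax may differ between emu8086 versions:

```asm
include 'emu8086.inc'   ; must appear before the macros are used

.model small
.stack 100h
.code
main proc
    print 'Hello, '     ; PRINT: writes a string literal
    printn 'world!'     ; PRINTN: same, then moves to a new line
    putc '#'            ; PUTC: writes a single character

    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

These macros save the work of declaring a $-terminated string and setting up AH and DX for each message.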

Let's solve a problem that uses all that we have learned so far. The task is to read a number (1-9) from the user and print a reverse triangle shape using # in the console. An appropriate error message should also be displayed if the user enters an invalid character. A demo output is shown in the image.

Now comes the tricky part. We cannot use a single for loop to print a reverse triangle shape. For this, we have to use two loops, one inside the other, also known as nested loops. In the outer loop, we can track how many lines are left to print and also print the new line at the beginning or the end. The inner loop can be used to print #.
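Putting the pieces together, one possible solution is sketched below. It is not the original listing: the message texts, the label names, and the choice of BL and CL as loop counters are all assumptions.

```asm
.model small
.stack 100h

.data
    prompt  db 'Enter a number (1-9): $'
    err_msg db 0dh, 0ah, 'Invalid input!$'
    newline db 0dh, 0ah, '$'    ; carriage return + line feed

.code
main proc
    mov ax, @data
    mov ds, ax          ; make the data segment addressable

    lea dx, prompt
    mov ah, 09h         ; print the prompt
    int 21h

    mov ah, 01h         ; read one character into AL
    int 21h

    cmp al, '1'         ; reject anything below '1'
    jb  invalid
    cmp al, '9'         ; reject anything above '9'
    ja  invalid

    sub al, '0'         ; convert the ASCII digit to a number
    mov bl, al          ; BL = number of # on the current line

outer_loop:
    lea dx, newline
    mov ah, 09h         ; start a new line
    int 21h

    mov cl, bl          ; CL = how many # are left on this line
inner_loop:
    mov dl, '#'
    mov ah, 02h         ; print one #
    int 21h
    dec cl
    jnz inner_loop      ; repeat until the line is complete

    dec bl              ; the next line is one # shorter
    jnz outer_loop      ; repeat until no lines are left
    jmp exit

invalid:
    lea dx, err_msg
    mov ah, 09h         ; print the error message
    int 21h

exit:
    mov ah, 4ch         ; terminate the program
    int 21h
main endp
end main
```

For an input of 3, the outer loop runs three times and the inner loop prints 3, 2, and then 1 # characters, each group on its own line.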
