Assembler basics

Preface and scope of this document

Some time ago a friend asked me about mouse support for a Turbo Pascal program. I recommended the Matin mouse unit to him. However, it contained a large number of commands such as "mov", "int" etc. whose function he could not make sense of.

I explained the basics to him in an extensive email. This document was created on the basis of this email. I have expanded and systematized the original text.

The document is intended to enable the reader to understand smaller programs written in assembler and possibly to write them himself. It deals specifically with the use of assembler to support and extend the high-level language Turbo Pascal. Designing sophisticated graphics engines is beyond the scope of this document.

In my explanations I mainly refer to CPUs compatible with the 8086, since I only have more detailed experience with these. The basic principle is the same for all processors!

What is assembler?

An assembler is a translator for program code made up of machine instructions. These instructions can vary greatly depending on the processor used. Ordinary PCs use processors that are compatible with Intel's 8086. The instruction set has been extended over the years, but compatibility with the "original model" has always been maintained [01].

The main difference from all other programming languages lies in the kind of commands used. While a single command in a high-level language [02] is translated into several instructions in the final code, the assembler simply translates each assembler instruction into the corresponding binary form [03]. The assembler also replaces variables with the corresponding memory addresses.
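The one-to-one translation can be illustrated with a toy assembler. The following Python sketch is only a runnable model and not part of the original program; the helper names parse_number and assemble are made up for this illustration. It knows just two instruction forms, using the 8086 opcodes B8h (mov ax, 16-bit value) and CDh (int).

```python
def parse_number(text):
    """Parse an assembler literal such as '33h' (hex) or '51' (decimal)."""
    text = text.strip().lower()
    if text.endswith("h"):
        return int(text[:-1], 16)
    return int(text, 10)


def assemble(line):
    """Translate one instruction into its 8086 machine-code bytes.

    Toy model: only 'mov ax, <imm16>' and 'int <imm8>' are supported.
    """
    mnemonic, _, operands = line.strip().partition(" ")
    mnemonic = mnemonic.lower()
    if mnemonic == "mov":
        dest, src = [part.strip() for part in operands.split(",")]
        if dest.lower() != "ax":
            raise ValueError("toy assembler only knows 'mov ax, imm16'")
        value = parse_number(src) & 0xFFFF
        # Opcode B8h, then the 16-bit value with the low byte first.
        return bytes([0xB8, value & 0xFF, value >> 8])
    if mnemonic == "int":
        # Opcode CDh followed by the interrupt number.
        return bytes([0xCD, parse_number(operands) & 0xFF])
    raise ValueError("unknown instruction: " + line)


program = ["mov ax, 01h", "int 33h"]
machine_code = b"".join(assemble(line) for line in program)
print(" ".join(f"{byte:02x}" for byte in machine_code))  # b8 01 00 cd 33
```

Each source line becomes exactly one short byte sequence, which is the point of the comparison with high-level languages.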

High-level languages provide a variety of functions such as clrscr, writeln() etc. Implementing them in assembler would take several lines. For such standard functions you should always use those of the high-level language.

Assembler programming is also often referred to as system programming. The term hints at its very special character. Assembler programs can usually run on only one processor platform and can be ported to another system only with great effort [04]. It is not possible to write a program with 8086 instructions once and then reuse it on a RISC system, since the instructions and their use differ greatly.

Basic knowledge of the CPU

Before you dive into system programming, you should first recall a few basic facts about the structure of a computer. It consists of a large number of individual components such as memory, hard drive, graphics and sound cards, and peripherals such as keyboard, monitor and printer. At the center is the processor, the CPU. It has to coordinate all the devices with one another. The bus system is used for communication with the individual parts.

The CPU in turn is divided into the arithmetic unit, the control unit and the internal memory in the form of registers [05]. The registers are of particular interest to the programmer. Commands and their parameters are passed to the processor in them, and return values after executing a command are also stored there. Certain recurring information is indicated by the flags, which are in principle registers as well.

Note: The following information applies only to processors compatible with the 8086!

The 8086 had registers that were 16 bits wide [06]. This was due on the one hand to the technical possibilities of the time and on the other hand to the moderate demands placed on the hardware. The registers were given names, and matching abbreviations, that reflect their tasks.

Class              Abbr.  Meaning              More information
General registers  AX     Accumulator          Divided into high byte (AH) and low byte (AL)
                   BX     Base register        Divided into high byte (BH) and low byte (BL)
                   CX     Count register       Divided into high byte (CH) and low byte (CL)
                   DX     Data register        Divided into high byte (DH) and low byte (DL)
Pointer registers  SP     Stack pointer        Used to address the stack
                   BP     Base pointer         Used to address the stack
                   IP     Instruction pointer  Offset of the next instruction
Index registers    SI     Source index         Addressing support
                   DI     Destination index    Addressing support
Segment registers  CS     Code segment         Points to the current code segment
                   DS     Data segment         Points to the current data segment
                   SS     Stack segment        Points to the current stack segment
                   ES     Extra segment        Points to another data segment

You should stay away from the pointer, index and segment registers at the beginning. These point to the memory areas with the program code and the stack.

In the overview you can see that the general registers are divided into a high and a low section. For some operations a "half" register is sufficient. Therefore you can address these sections individually.
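How the two halves relate to the full register can be shown with a small model. The following Python sketch is only an illustration; the class name Register16 is made up, and the example values are arbitrary.

```python
class Register16:
    """Model of a 16-bit general register such as AX with its two halves."""

    def __init__(self, value=0):
        self.value = value & 0xFFFF

    @property
    def high(self):
        """Upper 8 bits, e.g. AH for AX."""
        return self.value >> 8

    @high.setter
    def high(self, byte):
        self.value = ((byte & 0xFF) << 8) | (self.value & 0x00FF)

    @property
    def low(self):
        """Lower 8 bits, e.g. AL for AX."""
        return self.value & 0xFF

    @low.setter
    def low(self, byte):
        self.value = (self.value & 0xFF00) | (byte & 0xFF)


ax = Register16(0x1234)
print(hex(ax.high), hex(ax.low))  # 0x12 0x34
ax.low = 0xFF                     # like "mov al, 0FFh": AH stays untouched
print(hex(ax.value))              # 0x12ff
```

Writing to one half changes the full 16-bit value but leaves the other half as it was.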

As already mentioned, there are flags in addition to the registers. These are individual memory cells (switches) that each hold the value 0 or 1. The flags also have specific names from which their meaning and function can be derived. A distinction is made between flags set by the processor (status flags) and flags that can be set by the program (control flags).

Class          Abbr.  Name                   Meaning               Set by
Status flags   CF     Carry flag             Carry flag            Processor
               AF     Auxiliary carry flag   Auxiliary carry flag  Processor
               ZF     Zero flag              Zero flag             Processor
               SF     Sign flag              Sign flag             Processor
               PF     Parity flag            Parity flag           Processor
               OF     Overflow flag          Overflow flag         Processor
Control flags  TF     Trap flag              Single step flag      Program
               IF     Interrupt enable flag  Interrupt flag        Program
               DF     Direction flag         Direction flag        Program

If the result of an operation is 0, the zero flag is set. That means it has the value 1. You can check and use this in your program.
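As an illustration of how status flags follow from a result, here is a small Python model of a 16-bit subtraction. The function name sub16 is made up, and only three of the status flags are modelled; both operands are assumed to be in the range 0 to FFFFh.

```python
def sub16(a, b):
    """Subtract b from a as a 16-bit CPU would and report some status flags."""
    result = (a - b) & 0xFFFF      # keep only 16 bits, like a register
    flags = {
        "ZF": int(result == 0),    # zero flag: the result is 0
        "CF": int(a < b),          # carry flag: a borrow was needed
        "SF": result >> 15,        # sign flag: copy of the highest bit
    }
    return result, flags


print(sub16(5, 5))  # (0, {'ZF': 1, 'CF': 0, 'SF': 0})
print(sub16(3, 5))  # (65534, {'ZF': 0, 'CF': 1, 'SF': 1})
```

A program can test such a flag after an operation and branch depending on its value.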

With each new generation of processors, new flags are added. However, these are often not documented and are of no importance for simple assembler programming.

The stack is also available to the CPU and the programmer. This is a separate area in RAM [07] that can be used for short-term intermediate storage. Turbo Pascal internally passes values to functions via the stack. On 8086 systems, the data put on the stack last is taken off first. This is called the LIFO principle [08]. You can think of the stack as a pile of plates where a new plate is always placed on top. If a plate is needed, the top one is taken first.
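The LIFO behaviour can be tried out with a Python list standing in for the stack. This is only a model: on the 8086 the corresponding instructions are "push" and "pop", and the values used here are arbitrary.

```python
stack = []            # the end of the list is the top of the stack

stack.append(0x1111)  # like "push": the value lands on top
stack.append(0x2222)
stack.append(0x3333)

first = stack.pop()   # like "pop": the value pushed last comes off first
second = stack.pop()
print(hex(first), hex(second))  # 0x3333 0x2222
print([hex(v) for v in stack])  # ['0x1111']
```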

From the life of a CPU

Imagine the following situation: The program to be executed is in main memory, the registers are set correctly and point to the next instruction and the stack. The CPU now loads the next instruction into the register and analyzes it. If parameters are required, they are also loaded into the corresponding registers. The instruction is then executed and any return values are stored in the registers. These can then be used by the next instruction. The CPU now loads the next instruction and works its way through the entire program code in this way.
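This load, analyze, execute cycle can be sketched as a very small interpreter. The Python model below is a simplification under stated assumptions: instructions are tuples instead of bytes, only "mov" and "add" exist, and the function name run is made up.

```python
def run(program):
    """Execute a list of (mnemonic, dest, source) tuples one after another."""
    regs = {"ax": 0, "bx": 0, "ip": 0}
    while regs["ip"] < len(program):
        mnemonic, dest, source = program[regs["ip"]]  # load the instruction
        regs["ip"] += 1              # IP now points to the next instruction
        # Analyze the operand: a register name or a constant value.
        value = regs[source] if source in regs else source
        if mnemonic == "mov":        # execute
            regs[dest] = value & 0xFFFF
        elif mnemonic == "add":
            regs[dest] = (regs[dest] + value) & 0xFFFF
        else:
            raise ValueError("unknown instruction: " + mnemonic)
    return regs


result = run([("mov", "ax", 2), ("mov", "bx", 3), ("add", "ax", "bx")])
print(result["ax"])  # 5
```

The result of one instruction stays in a register and is picked up by the next one, just as described above.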

However, unexpected events can occur during execution, to which the CPU has to react. The work is then interrupted at the current position and the necessary actions are carried out. Such an interrupt could be a key press by the user. In order to output the pressed key on the screen, another part of the program must be executed, so the CPU branches to that routine. It then returns to the point at which it was interrupted and continues executing the program.

The programmer can use such interrupts for his own purposes. For example, he can redirect the timer interrupt to his own routine, which is then called at regular intervals. There are also a number of predefined interrupts. These can be compared with the functions of high-level languages. The parameters are passed in the general registers and then the corresponding interrupt is called. This way you can, for example, address the mouse driver or the graphics card, or reboot the system.
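The idea of passing parameters in registers and then calling an interrupt can be modelled in a few lines. The Python sketch below is purely illustrative: the handler is a stand-in for a real mouse driver, the returned position is invented, and all names are hypothetical.

```python
registers = {"ax": 0, "bx": 0, "cx": 0, "dx": 0}
log = []


def mouse_handler(regs):
    """Stand-in for a mouse driver: AX selects the function."""
    if regs["ax"] == 0x01:
        log.append("show mouse cursor")
    elif regs["ax"] == 0x03:
        # Return values are handed back in registers (invented position).
        regs["cx"], regs["dx"], regs["bx"] = 320, 100, 0


# The interrupt table maps an interrupt number to its routine.
interrupt_table = {0x33: mouse_handler}


def interrupt(number):
    """Model of the "int" instruction: look up the routine and call it."""
    interrupt_table[number](registers)


registers["ax"] = 0x01
interrupt(0x33)
registers["ax"] = 0x03
interrupt(0x33)
print(log, registers["cx"], registers["dx"])  # ['show mouse cursor'] 320 100
```

Redirecting an interrupt to your own routine corresponds to replacing an entry in interrupt_table in this model.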

Let's take action!

The use of assembler will be demonstrated with a small Turbo Pascal program (1226 bytes) that makes the mouse pointer visible, determines the current position of the mouse and reads the button status of the mouse. A mouse driver must be installed for the program to work [09].

At the beginning of the program, the mouse pointer is made visible on the screen with the help of the "mouse_an" procedure:

procedure mouse_an; assembler;
asm
  mov ax, 01h
  int 33h
end;

The first thing you notice is the keyword "assembler". This tells the compiler that the following procedure is written entirely in assembler. The keyword "asm" introduces the assembler block. The block is closed with "end;". Pure Turbo Pascal commands such as "writeln()" etc. must not be placed in between.

Now the value 01h is moved into the AX register of the processor with the help of the command "mov". The "h" stands for hexadecimal. Even though 1 is the same number in hexadecimal and decimal notation, you should get used to the hexadecimal notation [10].
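Why the habit matters becomes clear as soon as the digits differ. The short Python check below uses Python's 0x prefix in place of the assembler's h suffix; the variable names are made up for the illustration.

```python
# Python writes hexadecimal numbers with a 0x prefix instead of the h suffix.
function_show_cursor = 0x01   # 01h
mouse_interrupt = 0x33        # 33h

print(function_show_cursor)   # 1  (same value in both notations)
print(mouse_interrupt)        # 51 (33h is not decimal 33)
print(hex(51))                # 0x33
```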

The "mov" command carries out a transfer from the source (right operand) to the destination (left operand). So you can transfer the value of AX to a variable as follows:

mov var, ax

But back to the example above. After the value 1 has been stored in register AX, interrupt 33h [11] is called. This is the mouse interrupt, which is implemented by the mouse driver. Certain standards have emerged for it over time. Function 01h of the mouse interrupt switches on the display of the mouse cursor. For almost all interrupts, the desired function is selected via the AX register. In some cases the main function is given in the AH register and the sub-function in the AL register. Interrupt lists provide an overview of all common interrupts and their functions and should always be consulted. I advise against simply trying out interrupts and functions at random.

If everything goes well, the mouse cursor should now appear on the screen. You can move it too. The mouse driver takes over control and display.

Next, a procedure is written that calls function 03h of the mouse interrupt. It returns the x-position of the mouse in the CX register and the y-position in the DX register. The button status of the mouse is returned in the BL register. While the values in CX and DX correspond directly to the coordinates, the value in BL still has to be interpreted. A 0 means that no button is pressed. The other values can be found in the program! The procedure looks like this:

procedure mouse_status; assembler;
asm
  mov ax, 03h
  int 33h
  mov status.x, cx
  mov status.y, dx
  mov status.taste, bl
end;

The first 4 lines should be familiar by now. With knowledge of the "mov" command, the remaining 4 lines pose no problem either: the contents of the individual registers are copied into a record.
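Interpreting the button value usually means testing individual bits. The Python sketch below assumes the common Microsoft mouse driver convention (bit 0 = left, bit 1 = right, bit 2 = middle button); this layout is an assumption and is not taken from the example program, and the function name decode_buttons is made up.

```python
def decode_buttons(bl):
    """Split the button status byte into one boolean per button.

    Assumed bit layout: bit 0 = left, bit 1 = right, bit 2 = middle.
    """
    return {
        "left": bool(bl & 0x01),
        "right": bool(bl & 0x02),
        "middle": bool(bl & 0x04),
    }


print(decode_buttons(0))  # {'left': False, 'right': False, 'middle': False}
print(decode_buttons(3))  # {'left': True, 'right': True, 'middle': False}
```

A value of 0 gives "no button pressed", which matches the description above.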

All other functions of the program run analogously and should be compared with the examples given above.

Further options for self-study

First, you should take a look at your local library. Sometimes you can find books on assembler there. These usually start with the basics and show possibilities for using interrupts.

At this point I would like to introduce 3 selected books. The book PC Underground (ISBN 3-8158-1185-6) has proven itself for the use of assembler in the high-level language Turbo Pascal. The use of system programming is explained here using vector graphics and sound players. The book is apparently no longer available through normal bookshops. One can therefore only try to find it from friends or acquaintances.

The situation with the second book is completely different in terms of availability. You could call it a kind of standard work, so to speak.

At this point I would like to point out that HPFSC receives a small commission from Amazon.de if you buy the book via the corresponding links. The recommendation is based not on this fee but on the following points:

  • very good structure
  • practical introduction for beginners
  • clear, understandable writing style
  • Computer structure (bus system, CPU, peripherals, etc.) well presented
  • extraordinarily good price-performance ratio

Assembler programming language. A structured introduction.

Author: Reiner Backer
Price: 9.90 euros
ISBN: 3499612240
Publisher: Rowohlt TB-V.
Pages: 352 in paperback format
 

The last recommendation is the book "Basics and Concepts of Computer Science" by Hartmut Ernst.

As the title suggests, the book is not solely concerned with assembly language programming. In a 50-page chapter, system programming is explained using the Motorola M68000 microprocessor. This chip is very interesting because the 68HC11 microcontroller, which is programmed in a similar way, is used in many embedded systems.

The author always explains the problem first in general and then specifically applies it to the programming of the M68000.

In addition to this very interesting chapter, the 800-page book offers a lot more. It covers essential components of the undergraduate degree in computer science (and related disciplines). The basics of information theory, switching algebra, software development, automation, algorithms and database technology are taught. There is also a chapter on programming in C and a small chapter on the Java programming language. Both chapters are sufficient to get you started, the rest can be found on the Internet.

The really impressive thing about the book, however, is its unbelievable price-performance ratio. Normally, specialist books on just one of these topics cost almost € 50. The book, on the other hand, costs just € 5!

  • Attention: suitable for students, electrical engineers and engineers
  • Assembler programming explained with examples using the M68000
  • Theory is illustrated with meaningful examples
  • Book covers basic studies of computer science thematically (except mathematics)
  • Book can be used as a reference work
  • unbeatable value for money

Basics and concepts of computer science

Author: Hartmut Ernst
Price: 29.90 euros
ISBN: 3528257172
Publisher: Vieweg Verlag Wiesbaden
Pages: 888
 

Another little tip: Anyone who has envisaged studying computer science can easily determine with this book whether this is the right discipline for them, because when reading it, it quickly becomes clear that a computer scientist is not a programmer!

If you have any questions about special problems, don't forget to take a look at de.comp.lang.assembler.x86.

If you are interested in 3D engines and various effects, you can of course also find an almost unmanageable amount of documents with a search engine.

A complete list of interrupts can also be found on the Internet.

Sources of supply for assemblers

In addition to the assembler built into well-known high-level languages such as Turbo Pascal and C, there are of course also completely stand-alone assemblers. They all have their specific advantages and disadvantages, which you should judge for yourself!

The program is then written with a text editor and passed to the assembler as a parameter. Creating programs this way is very tedious and time-consuming. It is therefore not suitable for programming the 508th version of a vocabulary trainer, unless it is based on speech recognition!

Name  Manufacturer            Cost
A86   Eric Isaacson Software  Shareware
MASM  Microsoft               Included in Visual C++ 6.0 and possibly in the DirectX Development Kit. Just search for "masm".
TASM  Borland                 TASM used to be bundled with, for example, C++ Builder 5.0

A few more final words

Getting started with assembler programming is not easy. Today's compilers are so mature that in most cases they produce highly optimized code. This can only be improved upon after long study of the topic.

On the other hand, system programming offers a fascinating insight into the interaction of the individual components of the computer. You can learn a lot of interesting things about the tin box and respond better when errors occur!

You shouldn't be under the illusion that you can program the best graphics and sound engines with a bit of assembler. That always requires a good knowledge of mathematics (keyword: analytic geometry) as a second foundation. However, assembler can also be seen as a gateway to cracking programs, even if that is a dubious subject [12]!

Whatever you, dear reader, are planning to do, I hope that my text has made it easier for you to get started in a very complex area. I look forward to criticism and suggestions for improvement.