Sega Megadrive Assembly Programming Basics

Copyright, Lewis Bassett, December 2005
Editor's note: This chapter actually turned out to be very big, so I've split it into 6 different sections, each explaining a main part of the M68000 assembly language. This part of the document is very boring, and very tedious, but it is crucial to producing working Sega code.

As I explained in the last chapter, computers and games consoles actually run using encoded binary instructions. Binary instructions are very messy and hard to work with, so most programmers use assembly language instead, which is much easier to understand. Assembly language uses simple English keywords to represent basic processor operations. Each assembly instruction tells the processor to do one single small task. There are many different processors around, and each one has its own assembly language, so the assembly language used by the M68000 is referred to as 'M68000 Assembly', or sometimes just '68k ASM'.

During a process called assembly, assembly language instructions are converted to binary machine code instructions which are used by the processor. This is done by special software called an assembler.
Assembly instructions are written into a source file. A source file is the same as any other plain text file (.txt), except that it has the extension '.asm' at the end of its name. Any software that writes text as plain ASCII can be used to write source files.
As well as assembly instructions, assembly source files also contain variable data (e.g. graphics or sounds), defined data and information for the assembler software. The AS assembler can assemble code for many different types of processors with a variety of different options, so at the beginning of every main source file we always need to include the following lines:
    cpu 68000
    PADDING ON
    SUPMODE ON
    SECTION CODE

The first line, 'cpu 68000', informs the assembler that we want to assemble instructions for the 68000 processor. The second line, 'PADDING ON', instructs the assembler to insert padding bytes automatically, so that instructions and word-sized data always start on an even address, as the 68000 requires. The third line, 'SUPMODE ON', allows the assembler to assemble privileged instructions, which can only run when the processor is in supervisor mode, which is the mode the Megadrive runs in. The final line, 'SECTION CODE', opens a section named 'CODE', which will hold our assembly instructions. These four lines are very important, and are needed in order for the assembler to assemble our programs correctly. All assembly instructions and program data follow these four lines.

At the end of a main source file, we need to insert the following line:

    ENDSECTION

This line informs the assembler that it has reached the end of the code section.

Assembly instructions can also be written into another source file and inserted into a main source file when it's assembled. This is useful because it allows us to reuse the same blocks of code in many different projects, without having to rewrite them every time. Including other source files also allows you to separate a big program into many different files, each containing a certain part of the program. Files can be inserted with the 'include' directive:

    include "FileName.asm";

Notice that the filename is contained inside double quotes ("), and a semi-colon is placed at the end of the directive. Files from different directories can be inserted too:

    include "Folder/FileName.asm";

Raw binary data can also be included into the final ROM by using the 'binclude' directive:

    binclude "Data/Artwork.dat";

With the 'binclude' directive, files with any extension can be included, but it's up to you to make sure that the data contained inside is correct.
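To tie the directives above together, here is a rough sketch of how a main source file might be laid out. The file names are the placeholder names used in the examples above, not real files, and the sketch only shows the layout; on its own it is not a complete, bootable Megadrive ROM.

```asm
; Main.asm (hypothetical name): layout of a main source file, sketch only
	cpu 68000			; assemble for the 68000 processor
	PADDING ON			; padding option, as described above
	SUPMODE ON			; allow privileged (supervisor mode) instructions
	SECTION CODE			; everything up to ENDSECTION belongs to this section

	; assembly instructions and program data go here

	include "Folder/FileName.asm"	; insert another source file (a module) at this point
	binclude "Data/Artwork.dat"	; insert the raw bytes of a data file at this point

	ENDSECTION			; end of the code section
```

The text after each semi-colon is a comment, so the notes on the right have no effect on the assembled output.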
Later on, we'll write some common functions which we'll save as separate files, so that we can use them in many different projects. Source files which are included into different projects in this way are called modules. Note: common modules can be downloaded from the source code section. Modules don't need the four lines at the beginning and the one line at the end (the lines I explained earlier), as long as the main source file which they're inserted into has them.

At this point, it's important to understand the difference between assembler directives and actual assembly instructions. Assembler directives, including all the lines mentioned above, do not actually produce any machine code. They simply tell the assembler to do various tasks on the source code before it's assembled. Assembly instructions, however, each produce one machine code instruction.

Assembly source code files can also contain comments. Comments always follow a semi-colon (;) and are completely ignored by the assembler. Example:

    ; This program was written by me

This line will not affect the final ROM. Comments are useful because they allow us to document our programs, and explain what's happening. Just because we understand our programs when we write them, it doesn't mean we'll understand them when we want to modify them a year later!
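As a small illustration of the difference, the sketch below mixes a comment, a directive and two instructions. The 'nop' and 'rts' instructions have not been introduced yet; they are used here only because each one assembles to a single, well known 16-bit value.

```asm
; This whole line is a comment, so it produces nothing in the final ROM
	cpu 68000	; directive: tells the assembler which processor to target, produces no machine code
	nop		; instruction: assembles to the machine code value $4E71
	rts		; instruction: assembles to the machine code value $4E75
```

Only the last two lines end up as bytes in the ROM. The directive changes how the assembler behaves, and the comments are thrown away.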
Assembly instructions tell the processor to do a single simple task. Each assembly instruction is translated into one or more 16-bit encoded values during the assembly process. Assembly instructions are written into a source file, and are always written away from the left margin (normally the space of one tab). If an instruction is written on the left margin, the assembler won't recognise it as an instruction (text at the left margin is read as a label), and will report an error.

Assembly instructions are made from up to three parts: an opcode, and up to two operands. An opcode is a simple English keyword that tells the processor what to do. Examples of opcodes include: ADD, SUB, MOVE, TST and JMP. Operands are used to tell the processor what data will be used and affected by the instruction. Operands can be memory addresses, processor registers or immediate numbers. Here's an example of an assembly instruction:

    move.l #$43, $ff0000;

The above instruction would move the hexadecimal number 43 into the memory location $ff0000, which is the start of the Megadrive's RAM. Don't worry, I'll explain every part of this instruction.

The opcode 'MOVE' is followed by '.l'. Instructions that work with data must tell the processor whether the data they're working with is a byte (8 bits), a word (16 bits) or a long word (32 bits). This is done by suffixing the opcode with '.b', '.w' or '.l', respectively. The following instruction would move the same number, but as a byte:

    move.b #$43, $ff0000;

The first operand, in this case the hex number 43, is called the source operand. In this example, this is the number that will be moved. The second operand, in this example an address, is the destination. Notice that in assembly, an immediate number is prefixed with a '#'. Numbers that aren't prefixed are treated as addresses. In this example, the second operand is an address, but the first is an immediate number. Notice also that because the number is a hexadecimal number, it's prefixed with a '$'. If the '$' wasn't there, the number would be treated as a normal decimal number.
Assembly instructions can also use binary numbers as an operand, as long as the binary number is prefixed with a '%'. Example:

    move.b #%01101111, $ff0004;

Many of the different assembly instructions are explained in greater detail in the next section.
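To make the size suffixes and number prefixes more concrete, here is a short sketch showing what each variation would write into RAM. The addresses are just the example addresses used above. The 68000 stores the most significant byte of a word or long word first, so the value $43 ends up in the last byte written.

```asm
	move.b	#$43,$ff0000		; byte:      $FF0000 = $43
	move.w	#$43,$ff0000		; word:      $FF0000 = $00, $FF0001 = $43
	move.l	#$43,$ff0000		; long word: $FF0000 to $FF0002 = $00, $FF0003 = $43

	move.b	#67,$ff0004		; no '$': decimal 67, the same value as $43
	move.b	#%01000011,$ff0004	; '%': binary 01000011, again the same value as $43
	move.b	$ff0000,$ff0004		; no '#': copies the byte stored AT address $FF0000
```

The last line is the one to watch out for. Leaving off the '#' turns the source operand from a number into an address, so the instruction copies whatever is stored there instead of the number itself.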
All processors contain registers. Registers are simply very small storage areas inside the processor which hold temporary data being processed by the CPU. Because registers are contained inside the processor and are so small, they are much faster to use than memory is. A good programmer uses registers instead of memory wherever possible. The M68000 has 8 general purpose registers, called data registers, and 8 address registers. All of them can hold 32 bits (a long word) of data. As well as data and address registers, there are also other registers which the programmer can use.

Data registers are represented in assembly as 'd0', 'd1', 'd2', etc, up to 'd7'. These are used to store information that is currently being used by one or more instructions. For example, a programmer might want to use data stored at an address quite a few times. Rather than keep referring to the memory address, which would be slow, the programmer would copy the data into a data register, which would be much faster. Some assembly instructions require data to be held in a data register.

Address registers are quite like data registers, and are represented as 'a0', 'a1', etc, up to 'a7'. However, as the name suggests, address registers store the addresses of the locations where the actual data is stored. Simply, address registers 'point' to information. This is useful because it allows memory addresses to be used as variables which can then be incremented, tested, compared, etc. I'll explain this in greater detail later.

Address register 'a7' is a special address register. It is used to point to the system stack. The stack is a location in memory where data from many registers is stored temporarily, often while the registers are required for something else. The stack can hold information from many registers at once by working with a LIFO (Last In First Out) structure. This means that the information put on last is the first to be copied out. This is where the stack gets its name from.

The Megadrive has 64 kilobytes of main RAM. As programmers, we can use the memory in the range of addresses $FF0000 - $FFFFFF for general use. The rest of the address space is used for specific hardware parts which are explained in later chapters.
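Going back to the registers and the stack described above, the following sketch shows a data register, an address register and the LIFO behaviour of the stack working together. It uses addressing syntax such as '(a0)', '-(a7)' and '(a7)+' that this tutorial has not covered yet, and it assumes that 'a7' already points to a valid stack area in RAM, so treat it as a preview rather than something to memorise.

```asm
	move.l	#$12345678,d0	; put a 32-bit number into data register d0
	movea.l	#$ff0000,a0	; a0 now holds the address of the start of RAM
	move.l	d0,(a0)		; copy d0 into the memory that a0 points to ($FF0000)

	move.l	d0,-(a7)	; push d0 onto the stack
	move.l	a0,-(a7)	; push a0 onto the stack (it is now on top)

	; ... d0 and a0 can be reused for something else here ...

	movea.l	(a7)+,a0	; pop the top value: a0 went on last, so it comes off first
	move.l	(a7)+,d0	; pop the next value back into d0
```

Because the stack is Last In First Out, the values have to be taken off in the opposite order to the one they were put on in. Swapping the last two lines would leave the old value of 'a0' in 'd0' and the old value of 'd0' in 'a0'.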
The assembler converts our source code files into working Megadrive ROMs by using a three step process. During the first step of conversion, all assembler directives are interpreted, and all the different files are included into the main source file. During the second step, all of the assembly instructions are checked and verified, to make sure they are correct. In the third step, all of the assembly instructions are assembled into encoded machine code instructions, in the form of an intermediate code file (a '.p' file). This file is then converted into a binary ROM image using the 'p2bin.exe' program, and the ROM image can then be run on an emulator, or copied onto a Megadrive cartridge.

In the next 6 sections, I'll explain all of the main types of assembly instructions, and how to use them to perform a wide range of different operations. This part of the learning curve is very steep, but after you've understood these basics, we can move on to producing real working Megadrive programs.