MASM Basics

Integer constants and expressions.

  • Optional leading + or - sign.

  • Binary, Decimal, Hexadecimal.

  • Common radix characters:

    • h => hexadecimal (Use as much as possible)

    • d => Decimal (When hex makes no sense)

    • b => Binary (For bitwise clarity)

    • r => Encoded real (Real)

  • Examples: 30d, 6Ah, 42, 1101b

NOTE: A hexadecimal constant can't begin with a letter; prefix it with a zero. => 0A2h

Expressions

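Integer expressions use arithmetic operators and parentheses, as in any programming language. The assembler evaluates them at assembly time, not at runtime.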
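A minimal sketch of the constant forms and constant expressions described above. The variable names are made up for illustration, and the fragment assumes it sits inside a 32-bit MASM program with the default (decimal) radix.

```asm
.data
decVal   BYTE 30d          ; decimal, explicit radix character
hexVal   BYTE 6Ah          ; hexadecimal
hexVal2  BYTE 0A2h         ; leading zero, because the first hex digit is a letter
binVal   BYTE 1101b        ; binary
noRadix  BYTE 42           ; no radix character: treated as decimal

.code
    mov eax, 36 * 25       ; assembler computes 900 before the program runs
    mov ebx, (4 + 5) * 2   ; 18
    mov ecx, -(3 + 4)      ; -7
```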


Character and string

  • Enclose character in single or double quotes.

    • 'a' , "a" => (ASCII char == 1 byte).

  • Enclose strings in single or double quotes.

    • "Hello" , 'Hello' => (Each ASCII char == 1 byte).

  • Embedded quotes are allowed

    • 'Say "Abuqasem" is learning'

    • "This isn't a test"

NOTE: A string should end with a zero byte (the number 0, not the character '0') so that a print routine knows where to stop. => "Hello world",0 (older DOS-era code used '$' as the terminator instead)
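As a sketch, the quoting rules and the terminator might look like this in a data section. The label names are hypothetical.

```asm
.data
msg1     BYTE "Hello world", 0                  ; zero byte marks the end of the string
msg2     BYTE 'Say "Abuqasem" is learning', 0   ; double quotes inside single quotes
msg3     BYTE "This isn't a test", 0            ; single quote inside double quotes
oldStyle BYTE "Hello world$"                    ; '$'-terminated, as expected by older DOS output routines
```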


Reserved words and identifiers

  • Reserved words cannot be used as identifiers.

    • Instruction mnemonics, directives, type attributes, operators, predefined symbols.

  • Identifiers

    • 1-247 chars, including digits.

    • Not case sensitive.

    • First character must be a letter or one of (_, @, ?, $); it cannot be a digit.

    • Used for labels (Procedure names, variables, constants).


Directives

  • Instructions on how to assemble (Not at runtime).

  • Commands that are recognized and acted upon by the assembler.

    • Not part of the Intel instruction set.

    • Used to declare code and data areas, select the memory model, declare procedures, variables, etc.

    • Not case sensitive (.data,.DATA,.DAta).

  • Different assemblers have different directives

    • The directives of the GNU assembler and the Netwide Assembler (NASM) are not the same as MASM's.

  • One important function of assembler directives is to define program sections, or segments

    • .data (define variables)

    • .code (write code)

    • .stack 100h
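To show where these section directives sit in a whole program, here is a minimal skeleton. It is a sketch under assumptions that are not in the notes: a 32-bit flat-model Windows program that exits through the Win32 ExitProcess function (so it must be linked with kernel32.lib), with hypothetical names main and sum.

```asm
.386
.model flat, stdcall
.stack 100h                              ; reserve 256 bytes of stack
ExitProcess PROTO, dwExitCode:DWORD      ; Win32 API prototype

.data                                    ; variables go here
sum DWORD 0

.code                                    ; instructions go here
main PROC
    mov eax, 5
    add eax, 6
    mov sum, eax                         ; sum = 11
    INVOKE ExitProcess, 0                ; return control to the operating system
main ENDP
END main
```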


Labels

  • Act as place markers

    • Mark the address (offset) of code and data.

  • Follow identifier rules

  • Data label (Variable names)

    • Must be unique.

    • count DWORD 100 => (not followed by a colon)

  • Code label

    • Target of jump and loop instructions.

    • L1: => (Followed by a colon)
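The two kinds of label can be seen side by side in a short fragment. This is only an illustration, assuming it is placed inside a procedure of a 32-bit program.

```asm
.data
count DWORD 100        ; data label: no colon

.code
    mov eax, 0
    mov ecx, count     ; use the data label as a variable
L1:                    ; code label: followed by a colon
    inc eax
    loop L1            ; L1 is the target of the loop instruction
```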

Mnemonic

  • No operands

    • stc => (set carry flag)

  • One operand

    • inc eax => (register)

    • inc myByte => (memory)

  • Two operands

    • add ebx,ecx => (register, register)

    • sub myByte,25 => (memory, constant)

    • add eax,36 * 25 => (register, const-expr)

NOP instruction

  • No Operation.

  • Uses 1 byte of storage.

  • CPU: Reads it, Decodes it, ignores it.

  • Used to align code to address boundaries (for example, a multiple of 4).

  • x86 processors are designed to load code and data more quickly from even-doubleword addresses.
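A small illustration of NOP used as padding. It assumes 32-bit code that starts at offset 0; the offsets and machine-code bytes are written as comments.

```asm
    mov ax, bx       ; offset 00000000: 66 8B C3 (3 bytes)
    nop              ; offset 00000003: 90, pads to the next multiple of 4
    mov edx, ecx     ; offset 00000004: 8B D1, now starts on a doubleword boundary
```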


Intel instructions

  • Assembled into machine code by assembler and executed at runtime by CPU.

  • An instruction contains:

    • Label => (Optional)

    • Mnemonic => (Required)

    • Operands => (Depends on the instruction)

    • Comment => (Optional) begins with a ';'


Basic data types

  1. BYTE, SBYTE: 8-bit unsigned & signed integers.

  2. WORD, SWORD: 16-bit unsigned & signed integers.

  3. DWORD, SDWORD: 32-bit unsigned & signed integers.

  4. QWORD: 64-bit integer. => (Not signed/unsigned)

  5. TBYTE: 80-bit integer. => (ten byte)

  6. REAL4, REAL8: 4-byte IEEE short real & 8-byte IEEE long real.

  7. REAL10: 10-byte IEEE extended real.

Legacy data directives

  1. DB: 8-bit integer.

  2. DW: 16-bit integer.

  3. DD: 32-bit integer or real.

  4. DQ: 64-bit integer or real.

  5. DT: 80-bit integer. => (ten bytes).

Data definition statement

  1. A data definition statement sets aside storage in memory for a variable.

  2. May optionally assign a name (label) to the data.

  • Syntax: [name] directive initializer [, initializer] ... ![[variable statement.png]]

  1. Use the ? symbol for undefined variables.

  2. All initializers become binary data in memory.

Defining BYTE, SBYTE data

  • Each of the following defines a single byte of storage.

  1. Value1 BYTE 'A' => character constant.

  2. Value2 BYTE 0 => smallest unsigned byte.

  3. Value3 BYTE 255 => largest unsigned byte.

  4. Value4 SBYTE -128 => smallest signed byte.

  5. Value5 SBYTE +127 => largest signed byte.

  6. value6 BYTE ? => uninitialized byte.

  • The optional name is a label marking the variable's offset from the beginning of its enclosing segment.

  • If value1 is located at offset 0000 in the data segment and consumes 1 byte of storage, value2 is automatically located at offset 0001.

  • If you declare an SBYTE variable, the Microsoft debugger will automatically display its value in decimal with a leading sign.

Defining Byte Arrays

![[offset.png]]

  1. An array is simply a set of sequential memory locations.

  2. The directive (BYTE) indicates the offset needed to get to the next array element.

  3. No length, no termination flag, no special properties.
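A byte array might be declared like the sketch below. The names and values are illustrative, and the offsets in the comments assume list is the first item in the data segment.

```asm
.data
list  BYTE 10, 20, 30, 40            ; offsets 0000, 0001, 0002, 0003
list2 BYTE 10, 32, 41h, 00100010b    ; initializers may mix radixes
      BYTE 0Ah, 20h, 'A', 22h        ; same array continued on a second line
```

Nothing records that list has four elements; the program has to keep track of the length itself.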

Defining strings

  • A string is implemented as a sequence of characters.

    1. For convenience, it's usually enclosed in quotation marks.

    2. It's usually null terminated.

    3. Characters are bytes.

    4. Hex characters 0Dh(CR) and 0Ah(LF) are useful.

  • Example ![[define strings.png]]
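As a rough sketch (the label and the text are made up), a two-line message with CR/LF line endings and a null terminator could be defined as:

```asm
.data
greeting BYTE "Welcome to the demo program", 0Dh, 0Ah   ; first line, then CR and LF
         BYTE "Press any key to continue", 0Dh, 0Ah     ; continuation, no new label needed
         BYTE 0                                         ; null terminator
```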

Instructions

Outline

  1. Data Transfer Instructions

    • Operand types

    • MOV, MOVZX, MOVSX

    • LAHF, SAHF

    • XCHG

  2. Addition and Subtraction

    • INC, DEC

    • ADD, SUB

    • NEG

  3. Data-related operators and directives

  4. Indirect addressing

  5. JMP, LOOP

Operand types

  1. Immediate (constant integer: 8, 16, or 32 bits)

  2. Register (the name of register)

  3. Memory (reference to location in memory)

    • Memory address is encoded with the instruction, or a register holds the address of a memory location

Operand notation

Data Transfer Instructions

MOV instruction

  1. Move from source to destination

    • MOV destination, source

  2. Both operands must be the same size.

  3. No more than one memory operand permitted.

  4. CS, EIP, IP cannot be the destination.

  5. No immediate to segment registers moves.

  6. To move memory to memory, copy the value through a register.

  1. Direct memory operands
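The MOV rules above can be sketched with a few direct memory operands. The variable names are hypothetical; the commented-out lines are examples of statements the assembler would reject.

```asm
.data
bVal  BYTE 10h
wVal  WORD 2000h
wVal2 WORD ?

.code
    mov al, bVal         ; direct memory operand, 8 bits into an 8-bit register
    mov bx, wVal         ; 16 bits into a 16-bit register
    mov wVal2, bx        ; memory to memory, done through BX

    ; mov wVal2, wVal    ; error: two memory operands
    ; mov eax, wVal      ; error: operand sizes differ (32 bits vs 16 bits)
    ; mov ds, 45         ; error: immediate to a segment register
    ; mov 25, al         ; error: an immediate cannot be the destination
```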

MOV errors

Zero extension

  • When you copy a smaller value into a larger destination, the MOVZX instruction fills (extends) the upper bits of the destination with zeros. ![[zeroext.png]]

Sign extension

  • The MOVSX instruction fills the upper bits of the destination with a copy of the source operand's sign bit. ![[signext.png]]
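A short comparison of the two instructions on the same source byte, as an illustrative fragment:

```asm
    mov bl, 10001111b    ; BL = 8Fh, sign bit is 1
    movzx ax, bl         ; AX  = 008Fh      (upper bits filled with zeros)
    movsx cx, bl         ; CX  = FF8Fh      (upper bits filled with the sign bit)
    movzx eax, bl        ; EAX = 0000008Fh
    movsx edx, bl        ; EDX = FFFFFF8Fh
```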

XCHG instruction

  1. XCHG exchanges the values of two operands.

  2. At least one operand must be a register.

  3. No immediate operands are permitted.
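A sketch of the XCHG rules, using two hypothetical word variables:

```asm
.data
var1 WORD 1000h
var2 WORD 2000h

.code
    xchg ax, bx          ; register with register
    xchg var1, bx        ; memory with register
    ; xchg var1, var2    ; error: two memory operands

    ; swapping two memory operands has to go through a register:
    mov  ax, var1
    xchg ax, var2
    mov  var1, ax
```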

Examples

  1. There is no "range checking" - the address is calculated and used.

  2. Size of transfer is based on the destination.
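The two points above can be illustrated with constant offsets added to an array name. This is a sketch with made-up arrays; the last instruction is deliberately wrong to show that nothing stops it.

```asm
.data
arrayB BYTE 10h, 20h, 30h, 40h
arrayW WORD 1000h, 2000h, 3000h

.code
    mov al, arrayB         ; AL = 10h
    mov al, [arrayB+1]     ; AL = 20h
    mov al, [arrayB+2]     ; AL = 30h
    mov ax, [arrayW+2]     ; AX = 2000h (second WORD is 2 bytes in)
    mov al, [arrayB+20]    ; assembles without complaint, but reads a byte past the array
```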

Example 2

  • Write a program that adds the following three bytes:

Addition and Subtraction

INC and DEC instructions

  1. Add 1 to, or subtract 1 from, an operand (register/memory)

  2. INC destination => (i.e. destination++)

  3. DEC destination => (i.e. destination--)

ADD and SUB instructions

  1. ADD destination, source

  2. SUB destination, source

NOTE: Same operand rules as for the MOV instructions.
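A worked fragment using all four instructions. The variable names and values are chosen only for illustration; the running value of EAX is shown in the comments.

```asm
.data
var1 DWORD 10000h
var2 DWORD 20000h

.code
    mov eax, var1        ; EAX = 00010000h
    add eax, var2        ; EAX = 00030000h
    sub eax, 1000h       ; EAX = 0002F000h
    inc eax              ; EAX = 0002F001h
    dec eax              ; EAX = 0002F000h
    ; add var1, var2     ; error: two memory operands, same rule as MOV
```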

NEG (negate) instruction

  • Reverses the sign of an operand in a register/memory location (two's complement).
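For example, with two hypothetical signed variables:

```asm
.data
valB SBYTE -1
valW SWORD +32767

.code
    mov al, valB         ; AL = -1 (FFh)
    neg al               ; AL = +1 (01h)
    neg valW             ; valW = -32767 (8001h)
```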

Align directive

  • The ALIGN directive aligns a variable on a byte, word, doubleword, or a paragraph boundary:
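A sketch of ALIGN in a data section. The variable names are hypothetical, and the offsets in the comments assume bVal is the first item in the segment.

```asm
.data
bVal  BYTE  ?        ; offset 0
ALIGN 2              ; skip to the next even offset
wVal  WORD  ?        ; offset 2
bVal2 BYTE  ?        ; offset 4
ALIGN 4              ; skip to the next multiple of 4
dVal  DWORD ?        ; offset 8
```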

PTR operator

  • Overrides the default type of a label (Variable)

  • Provides the flexibility to access part of a variable.

  • Requires a prefixed size specifier

  • Little Endian order (revise)

  • PTR example

  • Combine elements of a smaller data type into a larger operand

When it loads a multi-byte value from memory, the CPU automatically reverses the bytes (little-endian order).

  • More examples
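The points above might look like this in practice. The names myDouble and myBytes are illustrative; the results follow from little-endian storage.

```asm
.data
myDouble DWORD 12345678h               ; stored in memory as 78h 56h 34h 12h
myBytes  BYTE  12h, 34h, 56h, 78h

.code
    ; mov ax, myDouble                 ; error: size mismatch without PTR
    mov ax, WORD PTR myDouble          ; AX = 5678h (low word)
    mov al, BYTE PTR myDouble          ; AL = 78h   (lowest byte)
    mov al, BYTE PTR [myDouble+1]      ; AL = 56h
    mov ax, WORD PTR [myDouble+2]      ; AX = 1234h (high word)

    ; combining smaller elements into a larger operand
    mov ax,  WORD PTR myBytes          ; AX  = 3412h
    mov eax, DWORD PTR myBytes         ; EAX = 78563412h
```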

TYPE operator

  • Returns the size of a single element of a data declaration (in bytes).

LENGTHOF operator

  • Counts the number of elements in a single data declaration

SIZEOF operator

  • Returns the total size in bytes: SIZEOF = LENGTHOF * TYPE
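The three operators can be compared on a few sample declarations. This is only a sketch: the names are made up, and DUP (which repeats an initializer) is not covered in these notes.

```asm
.data
byte1    BYTE  10, 20, 30          ; TYPE 1, LENGTHOF 3,  SIZEOF 3
array1   WORD  30 DUP(?), 0, 0     ; TYPE 2, LENGTHOF 32, SIZEOF 64
array2   DWORD 1, 2, 3, 4          ; TYPE 4, LENGTHOF 4,  SIZEOF 16
digitStr BYTE  "12345678", 0       ; TYPE 1, LENGTHOF 9,  SIZEOF 9

.code
    mov eax, TYPE array2           ; EAX = 4
    mov ecx, LENGTHOF array2       ; ECX = 4
    mov ebx, SIZEOF array2         ; EBX = 16
```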

Multiple lines and anonymous data

  • Spanning multiple lines ![[spanningmultiplelines.png]]

  • Anonymous data ![[anonymousdata.png]]

LABEL directive

  • Assigns an alternate label name and type to an existing storage location.

  • Does not allocate any storage of its own.

  • Avoids the need for the PTR operator.

dwList, wordList, intList are the same offset (address).
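The three names might be declared as in the sketch below; the byte values are illustrative. Only intList allocates storage, and the other two labels give the same bytes a different type.

```asm
.data
dwList   LABEL DWORD                 ; no storage, same address as intList
wordList LABEL WORD                  ; no storage, same address as intList
intList  BYTE 00h, 10h, 00h, 20h

.code
    mov eax, dwList                  ; EAX = 20001000h
    mov cx,  wordList                ; CX  = 1000h
    mov dl,  intList                 ; DL  = 00h
```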

OFFSET operator

Used for indirect addressing

  • OFFSET returns the distance in bytes of a label from the beginning of its enclosing segment.

  • Protected mode: 32 bits

  • Real mode: 16 bits

  • Example: Assume that bVal is located at offset 0040400h

  • Another example
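A sketch of OFFSET applied to consecutive variables. The names are hypothetical, and the comments give each address relative to bVal rather than as an absolute value.

```asm
.data
bVal  BYTE  ?
wVal  WORD  ?
dVal  DWORD ?
dVal2 DWORD ?

.code
    mov esi, OFFSET bVal     ; ESI = address of bVal
    mov esi, OFFSET wVal     ; ESI = address of bVal + 1
    mov esi, OFFSET dVal     ; ESI = address of bVal + 3
    mov esi, OFFSET dVal2    ; ESI = address of bVal + 7
```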

ESI register

Is an indirect operand (Register as a pointer).

  • It holds the address of a variable, usually an array or a string.

  • It can be de-referenced (just like a pointer) using [ESI].

  • Works with OFFSET to produce the address to de-reference.
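For instance, ESI can walk through a small byte array (val1 is an illustrative name):

```asm
.data
val1 BYTE 10h, 20h, 30h

.code
    mov esi, OFFSET val1     ; ESI points at the first byte
    mov al, [esi]            ; AL = 10h
    inc esi                  ; move the pointer one byte forward
    mov al, [esi]            ; AL = 20h
    inc esi
    mov al, [esi]            ; AL = 30h
```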

PTR for indirect addressing

  • Use it to clarify the size attribute of a memory operand

  • When we have an address (offset) we don't know the size of the values at that offset and must specify them explicitly.
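A minimal case where the size has to be spelled out. With a register destination the size is implied, but here the only operand is the memory location (myCount is a made-up name).

```asm
.data
myCount WORD 0

.code
    mov esi, OFFSET myCount
    ; inc [esi]              ; error: the assembler cannot tell the operand size
    inc WORD PTR [esi]       ; myCount = 1
```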

Indirect operand (variable as a pointer)

  • Offsets are of size DWORD.

  • A variable of size DWORD can hold an offset.

  • i.e., you can declare a pointer variable that contains the offset of another variable.
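A sketch of such a pointer variable, with hypothetical names arrayW and ptrW:

```asm
.data
arrayW WORD  1000h, 2000h, 3000h
ptrW   DWORD OFFSET arrayW       ; ptrW holds the offset (address) of arrayW

.code
    mov esi, ptrW                ; load the stored address into ESI
    mov ax, [esi]                ; AX = 1000h
```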

Array sum example

  • Indirect operands are ideal for traversing an array.

NOTE : The register in brackets must be incremented by a value that matches the array type (i.e 2 for WORD, 4 for DWORD, 8 for QWORD).
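A minimal version of the array sum, assuming a three-element WORD array named arrayW. The pointer advances by 2 because each element is 2 bytes.

```asm
.data
arrayW WORD 1000h, 2000h, 3000h

.code
    mov esi, OFFSET arrayW
    mov ax, [esi]            ; AX = 1000h
    add esi, 2               ; advance one WORD
    add ax, [esi]            ; AX = 3000h
    add esi, 2
    add ax, [esi]            ; AX = 6000h
```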

JMP instructions

  • Jumps are the basics of most control flow.

  • HLL compilers turn loops, if statements, switches, etc. into some kind of jump.

  • JMP is an unconditional jump to a label that is usually within the same procedure.

  • Syntax: JMP target

  • Logic: EIP <- target

A jump outside the current procedure must be to a special type of label called a global label.
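A small sketch of an unconditional jump within one procedure. The label name done is arbitrary.

```asm
    mov eax, 1
    jmp done             ; EIP <- address of done
    mov eax, 2           ; never executed
done:
    ; execution continues here with EAX = 1
```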

LOOP instruction

  • It creates a Counted loop using ECX

  • Syntax: LOOP target

  • Target should precede the instruction

    • ECX must contain the iteration count.

  • Logic:

    • ECX <- ECX - 1

    • If ECX != 0, jump back to target, else go to the next instruction.
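As an illustration of the logic above, this fragment sums 5 + 4 + 3 + 2 + 1 by adding the counter on each pass:

```asm
    mov ax, 0            ; running total
    mov ecx, 5           ; iteration count
L1:
    add ax, cx           ; add the current counter value
    loop L1              ; ECX <- ECX - 1, jump to L1 while ECX != 0
    ; here ECX = 0 and AX = 15
```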

