MASM Basics

Integer constants and expressions.
Optional leading + or - sign.
Binary, decimal, or hexadecimal digits.
Common radix characters:
h => Hexadecimal (use as much as possible)
d => Decimal (when hex makes no sense)
b => Binary (for bitwise clarity)
r => Encoded real
Examples: 30d, 6Ah, 42, 1101b
NOTE: A hexadecimal constant that begins with a letter must have a leading zero, so the assembler does not read it as an identifier. => 0A2h
Expressions
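Constant integer expressions use arithmetic operators and parentheses as in most programming languages; the assembler evaluates them at assembly time.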
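As a minimal sketch of the constant forms above, the following program declares one value per radix and uses a constant expression as an operand. It assumes 32-bit MASM with the flat memory model and ExitProcess from kernel32.lib; all variable names are illustrative.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
decVal  DWORD  30d          ; decimal with an explicit radix character
hexVal  DWORD  6Ah          ; hexadecimal
defVal  DWORD  42           ; no radix character: decimal by default
binVal  DWORD  1101b        ; binary
hexA2   DWORD  0A2h         ; leading 0 because the first hex digit is a letter
negVal  SDWORD -35          ; optional leading sign
exprVal DWORD  36 * 25      ; evaluated by the assembler, stored as 900

.code
main PROC
    mov eax, hexA2          ; EAX = 0A2h
    add eax, (4 + 2) * 6    ; expression folded at assembly time: adds 36, EAX = 0C6h
    INVOKE ExitProcess, 0
main ENDP
END main
```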
Character and string
Enclose character in single or double quotes.
'a', "a" => (each ASCII character occupies 1 byte).
Enclose strings in single or double quotes.
"Hello", 'Hello' => (each ASCII character occupies 1 byte).
Embedded quotes are allowed.
'Say "Abuqasem" is learning'
"This isn't a test"
NOTE: A string passed to a print routine must end with a null byte (the value 0, not the character '0') so the routine knows where to stop. => "Hello world",0 (DOS output routines used '$' as the terminator instead of zero)
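The role of the terminating zero can be sketched with a loop that walks a string until it reaches the null byte. This is an illustrative example only, assuming 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; the labels scan and finished and the variable names are made up for the sketch.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
letterA  BYTE 'A'                               ; one character, one byte
greeting BYTE "Hello world", 0                  ; null-terminated string
quote1   BYTE 'Say "Abuqasem" is learning', 0   ; double quotes inside single quotes
quote2   BYTE "This isn't a test", 0            ; apostrophe inside double quotes

.code
main PROC
    mov esi, OFFSET greeting    ; ESI points at the first character
    mov ecx, 0                  ; ECX counts characters
scan:
    cmp BYTE PTR [esi], 0       ; reached the terminating zero?
    je  finished
    inc esi                     ; characters are bytes, so step by 1
    inc ecx
    jmp scan
finished:
    ; ECX = 11, the length of "Hello world" without the terminator
    INVOKE ExitProcess, 0
main ENDP
END main
```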
Reserved words and identifiers
Reserved words cannot be used as identifiers.
Instruction mnemonics, directives, type attributes, operators, predefined symbols.
Identifiers
1-247 characters, including digits.
Not case sensitive.
First character must be a letter, _, @, ?, or $ (never a digit).
Used for labels (procedure names, variables, constants).
Directives
Instructions on how to assemble the program (not executed at runtime).
Commands that are recognized and acted upon by the assembler.
Not part of the Intel instruction set.
Used to declare code and data areas, select the memory model, declare procedures and variables, etc.
Not case sensitive (.data, .DATA, .DAta).
Different assemblers have different directives.
The GNU assembler and NASM (Netwide Assembler) do not use the same directives as MASM.
One important function of assembler directives is to define program sections, or segments.
.data (define variables)
.code (write code)
.stack 100h
Labels
Act as place markers.
Mark the address (offset) of code and data.
Follow identifier rules
Data label (Variable names)
Must be unique.
count DWORD 100 => (not followed by a colon)
Code label
Target of jump and loop instructions.
L1: => (followed by a colon)
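The two kinds of label can be seen side by side in a short sketch: count is a data label and L1 is a code label used as a jump target. The program assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; the skipped instruction exists only to show the effect of the jump.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
count DWORD 100         ; data label: no colon

.code
main PROC
    mov eax, count      ; EAX = 100
    jmp L1              ; transfer control to the code label
    mov eax, 0          ; skipped, never executed
L1:                     ; code label: followed by a colon
    inc eax             ; EAX = 101
    INVOKE ExitProcess, 0
main ENDP
END main
```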
Mnemonic
No operands
stc => (set carry flag)
One operand
inc eax => (register)
inc myByte => (memory)
Two operands
add ebx,ecx => (register, register)
sub myByte,25 => (memory, constant)
add eax,36 * 25 => (register, const-expr)
NOP instruction
No Operation.
Uses 1 byte of storage.
The CPU reads it, decodes it, and does nothing else.
Used to align code to even-address boundaries (for example, multiples of 4), because x86 processors are designed to load code and data more quickly from even doubleword addresses.
Intel instructions
Assembled into machine code by assembler and executed at runtime by CPU.
An instruction contains:
Label => (Optional)
Mnemonic => (Required)
Operands => (Depends on the instruction)
Comment => (Optional) begins with a ';'
Basic data types
BYTE, SBYTE: 8-bit unsigned & signed integers.
WORD, SWORD: 16-bit unsigned & signed integers.
DWORD, SDWORD: 32-bit unsigned & signed integers.
QWORD: 64-bit integer. => (not signed/unsigned)
TBYTE: 80-bit integer. => (ten bytes)
REAL4, REAL8: 4-byte short & 8-byte long IEEE reals.
REAL10: 10-byte IEEE extended real.
Legacy data directives
DB: 8-bit integer.
DW: 16-bit integer.
DD: 32-bit integer or real.
DQ: 64-bit integer or real.
DT: 80-bit integer. => (ten bytes)
Data definition statement
A data definition statement sets aside storage in memory for a variable.
May optionally assign a name (label) to the data.
Syntax ![[variable statement.png]]
Use the ? symbol for uninitialized variables.
All initializers become binary data in memory.
Defining BYTE, SBYTE data
Each of the following defines a single byte of storage.
value1 BYTE 'A' => character constant.
value2 BYTE 0 => smallest unsigned byte.
value3 BYTE 255 => largest unsigned byte.
value4 SBYTE -128 => smallest signed byte.
value5 SBYTE +127 => largest signed byte.
value6 BYTE ? => uninitialized byte.
The optional name is a label marking the variable's offset from the beginning of its enclosing segment.
If value1 is located at offset 0000 in the data segment and consumes 1 byte of storage, value2 is automatically located at offset 0001.
If you declare an SBYTE variable, the Microsoft debugger will automatically display its value in decimal with a leading sign.
Defining Byte Arrays
![[offset.png]]
An array is simply a set of sequential memory locations.
The directive (BYTE) indicates the offset needed to get to the next array element.
No length, no termination flag, no special properties.
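A byte array and its element offsets might look like the following sketch. It assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; list1 is an illustrative name.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
list1 BYTE 10, 20, 30, 40   ; four sequential bytes; list1 names the first one

.code
main PROC
    mov al, list1           ; element 0: AL = 10
    mov al, [list1+1]       ; element 1: AL = 20 (each BYTE element is 1 byte apart)
    mov al, [list1+3]       ; element 3: AL = 40
    INVOKE ExitProcess, 0
main ENDP
END main
```

Nothing in the declaration records that list1 has four elements; the program itself has to keep track of the length.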
Defining strings
A string is implemented as a sequence of characters.
For convenience, it's usually enclosed in quotation marks.
It's usually null terminated.
Characters are bytes.
The hex characters 0Dh (CR) and 0Ah (LF) are useful as an end-of-line sequence.
Example ![[define strings.png]]
Instructions
Outline
Data Transfer Instructions
Operand types
MOV, MOVZX, MOVSX
LAHF, SAHF
XCHG
Addition and Subtraction
INC, DEC
ADD, SUB
NEG
Data-related operators and directives
Indirect addressing
JMP,LOOP
Operand types
Immediate (a constant integer of 8, 16, or 32 bits)
Register (the name of a register)
Memory (reference to location in memory)
Memory address is encoded with the instruction, or a register holds the address of a memory location
Operand notation

Data Transfer Instructions
MOV instruction
Move from source to destination
MOV destination, source
Both operands must be the same size.
No more than one memory operand permitted.
CS, EIP, and IP cannot be the destination.
No immediate-to-segment-register moves.
To MOV memory to memory, copy the source into a register first and then store the register in the destination.
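Since MOV accepts at most one memory operand, a memory-to-memory copy takes two instructions. A minimal sketch, assuming 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib, with illustrative variable names:

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
var1 WORD 1234h
var2 WORD ?             ; uninitialized destination

.code
main PROC
    ; mov var2, var1    ; would not assemble: two memory operands
    mov ax, var1        ; memory -> register (both 16 bits)
    mov var2, ax        ; register -> memory, var2 = 1234h
    INVOKE ExitProcess, 0
main ENDP
END main
```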
Direct memory operands
MOV errors
Zero extension
When you copy a smaller value into a larger destination, the MOVZX instruction fills (extends) the upper bits of the destination with zeros. ![[zeroext.png]]
Sign extension
The MOVSX instruction fills the upper bits of the destination with copies of the source operand's sign bit. ![[signext.png]]
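The difference between the two extensions shows up when the source has its high bit set. The sketch below assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; byteVal is an illustrative name.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
byteVal BYTE 10001111b      ; 8Fh, sign bit is 1

.code
main PROC
    movzx eax, byteVal      ; zero extension: EAX = 0000008Fh
    movsx ebx, byteVal      ; sign extension: EBX = FFFFFF8Fh
    INVOKE ExitProcess, 0
main ENDP
END main
```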
XCHG instruction
XCHG exchanges the values of two operands.
At least one operand must be a register.
No immediate operands are permitted.
Examples
There is no "range checking" - the address is calculated and used.
Size of transfer is based on the destination.
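The lack of range checking can be illustrated with direct-offset operands. In this sketch (32-bit MASM, flat memory model, ExitProcess from kernel32.lib, illustrative names), arrayW is assumed to sit immediately after arrayB because it is declared next in the same segment.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
arrayB BYTE 10h, 20h, 30h
arrayW WORD 1000h, 2000h

.code
main PROC
    mov al, [arrayB+1]      ; AL = 20h, second element
    mov al, [arrayB+3]      ; past the end of arrayB: no error is reported,
                            ; AL = 00h, the low byte of arrayW's first element
    mov ax, [arrayW+2]      ; AX = 2000h; 16 bits are moved because AX is 16 bits
    INVOKE ExitProcess, 0
main ENDP
END main
```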
Example 2
Write a program that adds the following three bytes:
Addition and Subtraction
INC and DEC instructions
Add/Subtract 1 from operand (register/memory)
INC destination => (e.g. destination++)
DEC destination => (e.g. destination--)
ADD and SUB instructions
ADD destination, source
SUB destination, source
NOTE: Same operand rules as for the MOV instructions.
NEG (negate) instruction
Reverses the sign of an operand in a register or memory location by taking its two's complement.
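The arithmetic instructions in this section can be traced step by step in a short sketch. It assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; valD is an illustrative name.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
valD DWORD 1000h

.code
main PROC
    mov eax, valD       ; EAX = 00001000h
    inc eax             ; EAX = 00001001h
    dec eax             ; EAX = 00001000h
    add eax, 20h        ; EAX = 00001020h
    sub eax, 10h        ; EAX = 00001010h
    neg eax             ; two's complement: EAX = FFFFEFF0h
    inc valD            ; memory operand: valD = 00001001h
    INVOKE ExitProcess, 0
main ENDP
END main
```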

Data related operators and directives
Align directive
The ALIGN directive aligns a variable on a byte, word, doubleword, or paragraph boundary.
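A sketch of ALIGN in a data segment follows. The offsets in the comments assume bVal is the first item in the segment; the names are illustrative, and the program assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib. Only word and doubleword alignment are shown.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
bVal  BYTE  1           ; offset 0
ALIGN 2                 ; skip to the next even offset
wVal  WORD  2           ; offset 2
bVal2 BYTE  3           ; offset 4
ALIGN 4                 ; skip to the next multiple of 4
dVal  DWORD 4           ; offset 8

.code
main PROC
    mov eax, dVal       ; EAX = 4, read from a doubleword-aligned address
    INVOKE ExitProcess, 0
main ENDP
END main
```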
PTR operator
Overrides the default type of a label (Variable)
Provides the flexibility to access part of a variable.
Requires a prefixed size specifier
Little Endian order (revise)

PTR example

Combine elements of a smaller data type into a larger operand
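Because storage is little endian, the bytes appear in reverse order in the larger operand: the byte at the lowest address becomes the least significant byte.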
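Both uses of PTR, reading part of a larger variable and combining smaller elements into a larger operand, can be sketched as follows. The program assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; the variable names are illustrative.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
myDouble DWORD 12345678h            ; stored in memory as 78h 56h 34h 12h
myBytes  BYTE  12h, 34h, 56h, 78h

.code
main PROC
    ; Access part of a larger variable
    mov ax, WORD PTR myDouble       ; AX = 5678h (low word)
    mov al, BYTE PTR myDouble       ; AL = 78h
    mov al, BYTE PTR [myDouble+1]   ; AL = 56h
    mov ax, WORD PTR [myDouble+2]   ; AX = 1234h (high word)

    ; Combine smaller elements into a larger operand
    mov ax,  WORD PTR myBytes       ; AX  = 3412h
    mov eax, DWORD PTR myBytes      ; EAX = 78563412h
    INVOKE ExitProcess, 0
main ENDP
END main
```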
More examples
TYPE operator
Returns the size of a single element of a data declaration (in bytes).
LENGTHOF operator
Counts the number of elements in a single data declaration
SIZEOF operator
Returns the total number of bytes used by a data declaration, which is equivalent to multiplying:
SIZEOF = LENGTHOF * TYPE
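The three operators are evaluated by the assembler and produce constants, as the following sketch shows. It assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; the array names are illustrative.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
wordList  WORD  1000h, 2000h, 3000h, 4000h
dwordList DWORD 1, 2, 3, 4, 5

.code
main PROC
    mov eax, TYPE wordList      ; EAX = 2  (bytes per element)
    mov ebx, LENGTHOF wordList  ; EBX = 4  (number of elements)
    mov ecx, SIZEOF wordList    ; ECX = 8  (4 * 2)
    mov edx, SIZEOF dwordList   ; EDX = 20 (5 * 4)
    INVOKE ExitProcess, 0
main ENDP
END main
```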
Multiple lines and anonymous data
Spanning multiple lines ![[spanningmultiplelines.png]]
Anonymous data ![[anonymousdata.png]]
LABEL directive
Assigns an alternate label name and type to an existing storage location.
Does not allocate any storage of its own.
Avoids the need for the PTR operator.
dwList, wordList, and intList are at the same offset (address).
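One way the three names could be declared is sketched below: the two LABEL lines allocate nothing and simply give the bytes of intList a DWORD view and a WORD view. The byte values are illustrative, and the program assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
dwList   LABEL DWORD                ; no storage, DWORD view of what follows
wordList LABEL WORD                 ; no storage, WORD view of what follows
intList  BYTE 00h, 10h, 00h, 20h    ; the only storage: 4 bytes

.code
main PROC
    mov eax, dwList         ; EAX = 20001000h, no PTR needed
    mov cx,  wordList       ; CX  = 1000h
    mov dl,  intList        ; DL  = 00h
    INVOKE ExitProcess, 0
main ENDP
END main
```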
OFFSET operator
Used for indirect addressing
OFFSET returns the distance in bytes of a label from the beginning of its enclosing segment.
Protected mode: 32 bits
Real mode: 16 bits

Example: Assume that bVal is located at offset 0040400h
Another example
ESI register
Is an indirect operand (Register as a pointer).
It holds the address of a variable, usually an array or a string.
It can be dereferenced (just like a pointer) using [ESI].
Works with OFFSET to produce the address to dereference.
PTR for indirect addressing
Use it to clarify the size attribute of a memory operand
When we have an address (offset) we don't know the size of the values at that offset and must specify them explicitly.
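OFFSET, an indirect operand in ESI, and PTR can be combined as in the sketch below. It assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; the variable names are illustrative.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
bVal BYTE 10h
wVal WORD 2000h

.code
main PROC
    mov esi, OFFSET bVal    ; ESI holds the address of bVal
    mov al, [esi]           ; dereference: AL = 10h (size taken from AL)
    inc BYTE PTR [esi]      ; no register gives the size, so PTR is required: bVal = 11h

    mov esi, OFFSET wVal    ; ESI holds the address of wVal
    mov ax, [esi]           ; AX = 2000h
    inc WORD PTR [esi]      ; wVal = 2001h
    INVOKE ExitProcess, 0
main ENDP
END main
```

Without BYTE PTR or WORD PTR, the assembler rejects inc [esi] because it cannot tell how many bytes to modify.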
Indirect operand (variable as a pointer)
Offsets are of size DWORD.
A variable of size DWORD can hold an offset, i.e., you can declare a pointer variable that contains the offset of another variable.
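A pointer variable of this kind might be declared and used as follows. The sketch assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; arrayW and ptrW are illustrative names.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
arrayW WORD  1000h, 2000h, 3000h
ptrW   DWORD arrayW         ; ptrW is initialized with the offset of arrayW

.code
main PROC
    mov esi, ptrW           ; load the stored offset into ESI
    mov ax, [esi]           ; AX = 1000h, the first element of arrayW
    INVOKE ExitProcess, 0
main ENDP
END main
```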
Array sum example
Indirect operands are ideal for traversing an array.
NOTE: The register in brackets must be incremented by a value that matches the array type (i.e., 2 for WORD, 4 for DWORD, 8 for QWORD).
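An array sum with an indirect operand can be sketched like this, using TYPE so the increment always matches the element size. It assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; arrayD and its values are illustrative.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
arrayD DWORD 10000h, 20000h, 30000h

.code
main PROC
    mov esi, OFFSET arrayD  ; ESI points at the first element
    mov eax, [esi]          ; EAX = 10000h
    add esi, TYPE arrayD    ; advance by 4 bytes (DWORD)
    add eax, [esi]          ; EAX = 30000h
    add esi, TYPE arrayD
    add eax, [esi]          ; EAX = 60000h, the sum of all three elements
    INVOKE ExitProcess, 0
main ENDP
END main
```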
JMP instructions
Jumps are the basics of most control flow.
HLL compilers turn loops, if statements, switches, etc. into some kind of jump.
JMP is an unconditional jump to a label that is usually within the same procedure.
Syntax: JMP target
Logic:
EIP <- target
A jump outside the current procedure must be to a special type of label called a global label.
LOOP instruction
It creates a counted loop using ECX.
Syntax: LOOP target
Target should precede the instruction.
ECX must contain the iteration count.
Logic:
ECX <- ECX - 1
If ECX != 0, jump back to target; otherwise go to the next instruction.
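A counted loop that sums a WORD array could look like the sketch below, with LENGTHOF supplying the iteration count for ECX. It assumes 32-bit MASM, the flat memory model, and ExitProcess from kernel32.lib; intArray and its values are illustrative.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
intArray WORD 100h, 200h, 300h, 400h

.code
main PROC
    mov edi, OFFSET intArray    ; EDI points at the first element
    mov ecx, LENGTHOF intArray  ; ECX = 4, the iteration count
    mov ax, 0                   ; AX accumulates the sum
L1:
    add ax, [edi]               ; add the current element
    add edi, TYPE intArray      ; advance by 2 bytes (WORD)
    loop L1                     ; ECX = ECX - 1; jump to L1 while ECX != 0
    ; AX = 0A00h after four iterations
    INVOKE ExitProcess, 0
main ENDP
END main
```

If ECX were 0 on entry, the first LOOP would decrement it to FFFFFFFFh and the loop would run about four billion times, so the count should be set explicitly before the loop.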