Lecture Computer organization and assembly language: Chapter 32 - Dr. Safdar Hussain Bouk


Content

CSC 221 Computer Organization and Assembly Language
Lecture 32: Intel x86 Instruction Encoding

Lecture Outline
• Encoding Real x86 Instructions
• x86 Instruction Format Reference
• x86 Opcode Sizes
• x86 ADD Instruction Opcode
• Encoding x86 Instruction Operands, MOD-REG-R/M Byte
• REG Field of the MOD-REG-R/M Byte
• MOD R/M Byte and Addressing Modes
• SIB (Scaled Index Byte) Layout
• Scaled Indexed Addressing Mode
• Encoding ADD Instruction Example
• Encoding ADD CL, AL Instruction
• Encoding ADD ECX, EAX Instruction
• Encoding ADD EDX, DISPLACEMENT Instruction
• Encoding ADD EDI, [EBX] Instruction
• Encoding ADD EAX, [ ESI + disp8 ] Instruction
• Encoding ADD EBX, [ EBP + disp32 ] Instruction
• Encoding ADD EBP, [ disp32 + EAX*1 ] Instruction
• Encoding ADD ECX, [ EBX + EDI*4 ] Instruction
• Encoding ADD Immediate Instruction

Encoding Real x86 Instructions
• It is time to take a look at the actual machine instruction format of the x86 CPU family.
• They don't call the x86 CPU a Complex Instruction Set Computer (CISC) for nothing!
• Although more complex instruction encodings exist, no one is going to challenge that the x86 has a complex instruction encoding.
• The fields of an x86 instruction, in the order they appear:
– Prefix bytes: 0 to 4 special prefix values that affect the operation of the instruction.
– Opcode: one- or two-byte instruction opcode (two bytes if the special 0Fh opcode expansion prefix is present).
– "mod-reg-r/m" byte: specifies the addressing mode and instruction operand size. This byte is only required if the instruction supports register or memory operands.
– Scaled Index Byte: optional, present if the instruction uses a scaled index memory addressing mode.
– Displacement: a 0, 1, 2, or 4 byte value that specifies a memory address displacement for the instruction.
– Immediate/constant data: a 0, 1, 2, or 4 byte constant value if the instruction has an immediate operand.
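As a small worked check, the largest size of each field listed above can be added up to see how long an instruction the layout appears to allow. This is only an arithmetic sketch; the dictionary and function names are illustrative and not part of the lecture.

```python
# Largest size in bytes of each field, taken from the field list above.
MAX_FIELD_BYTES = {
    "prefixes": 4,       # 0 to 4 prefix bytes
    "opcode": 2,         # one or two bytes
    "mod_reg_rm": 1,     # present only for register/memory operands
    "sib": 1,            # present only for scaled index addressing
    "displacement": 4,   # 0, 1, 2, or 4 bytes
    "immediate": 4,      # 0, 1, 2, or 4 bytes
}


def max_apparent_length(fields=MAX_FIELD_BYTES):
    """Sum the maximum size of every field in the layout."""
    return sum(fields.values())


print(max_apparent_length())  # 16
```

The sum is 16 bytes, which is what the layout alone suggests; the processor enforces a tighter limit than this simple sum.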
• Although the diagram seems to imply that instructions can be up to 16 bytes long, in actuality the x86 will not allow instructions greater than 15 bytes in length.
• The prefix bytes are not the opcode expansion prefix discussed earlier; they are special bytes that modify the behavior of existing instructions.

x86 Instruction Format Reference
• Another view of the x86 instruction format:

(a) Optional Instruction Prefix
Field                  Number of Bytes
Instruction Prefix     0 or 1
Address-Size Prefix    0 or 1
Operand-Size Prefix    0 or 1
Segment Override       0 or 1

(b) General Instruction Format
Field          Number of Bytes
OpCode         1 or 2
Mod-R/M        0 or 1
SIB            0 or 1
Displacement   0, 1, 2 or 4
Immediate      0, 1, 2 or 4

Mod-R/M byte:   bits 7-6 Mod     bits 5-3 Reg/OpCode   bits 2-0 R/M
SIB byte:       bits 7-6 Scale   bits 5-3 Index        bits 2-0 Base

• Instructions have some combination of the following fields (not every instruction uses all of them):
– instruction prefix: sets certain options
– opcode: specifies the operation to perform
– Mod R/M: specifies addressing mode/operands
– SIB (scale index base): used for array indexing
– address displacement: used for addressing memory
– immediate value: holds the value of a constant operand

• Displacement
– We are really talking about an address offset within a segment (usually given as a named variable or a label in code)
– it could be a relative address, like the 8-bit value used for jumping forward or backward from the current location in the code segment
– or it could be the location of a variable in the data segment
– or it could be a FAR reference to code or data in another segment

• Displacement Examples
– jmp next, where next is a label in the current code segment
– add eax, var1, where var1 is a 32-bit variable in the current data segment
– sub bx, var2[ecx], where var2 is a 16-bit variable in the current code segment and ecx is an index register

• Immediate Values
– These are usually constants used directly in the operation, available immediately
– for example: ADD EAX, 7 where 7 is the immediate value; there is no variable name and no source register
– or: CMP AL, [EDX+10] where 10 is a constant that will be added to the contents of the EDX register to get the operand's location (strictly speaking, this constant is carried in the displacement field rather than the immediate field)
– or: SHL AX, 4 where the constant 4 is the number of bit positions to shift the AX register

• Instruction Prefix
– Used to specify options for instruction execution, for example:
– when executing string operations (MOVS, SCAS, etc.) the prefix is used to indicate that the operation should be repeated
  • for REP and REPE, the prefix is set to F3h
  • for REPNE, the prefix is F2h
– some values indicate the memory segment that should be used (instead of the default)
  • for ES the prefix is set to 26h, for FS it is 64h, etc.
– to change the default data size for an instruction (from 32-bit to 16-bit or vice versa), the prefix is set to 66h
– similarly, to change the default address size for an instruction, set it to 67h
– to lock shared memory so that only this instruction has access, set it to F0h

x86 Opcode Sizes
• The x86 CPU supports two basic opcode sizes:
– standard one-byte opcode
– two-byte opcode consisting of a 0Fh opcode expansion prefix byte followed by a second byte that specifies the actual instruction.
• This provides for up to 512 different instruction classes, although the x86 does not yet use them all.
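The prefix values and the 0Fh expansion byte described above can be put together in a short sketch that separates prefix bytes from the opcode at the front of an instruction. The table holds only the prefix values quoted in this lecture, so it is not exhaustive, and the function name is an illustrative assumption.

```python
# Prefix byte values quoted in the lecture; real x86 defines more than these.
PREFIXES = {
    0xF3: "REP/REPE",
    0xF2: "REPNE",
    0x26: "ES segment override",
    0x64: "FS segment override",
    0x66: "operand-size override",
    0x67: "address-size override",
    0xF0: "LOCK",
}
OPCODE_EXPANSION = 0x0F  # first byte of a two-byte opcode


def split_prefixes_and_opcode(code):
    """Return (prefix names, opcode bytes, remaining bytes) for one instruction."""
    i = 0
    names = []
    # Prefix bytes, if any, come before the opcode.
    while i < len(code) and code[i] in PREFIXES:
        names.append(PREFIXES[code[i]])
        i += 1
    if i >= len(code):
        raise ValueError("no opcode byte found")
    # A leading 0Fh means the opcode is two bytes long.
    size = 2 if code[i] == OPCODE_EXPANSION else 1
    if i + size > len(code):
        raise ValueError("truncated two-byte opcode")
    return names, bytes(code[i:i + size]), bytes(code[i + size:])


names, opcode, rest = split_prefixes_and_opcode(bytes([0x66, 0x01, 0xC1]))
print(names, opcode.hex(), rest.hex())
```

The sample bytes 66 01 C1 are an ADD with an operand-size prefix in front; the sketch only splits the bytes and does not interpret what follows the opcode.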
x86 ADD Instruction Opcode
• x86 ADD instruction opcode:

  bit:   7 6 5 4 3 2 1 0
         0 0 0 0 0 0 d s

– 000000 is the ADD opcode.
– d = 0 if adding from register to memory; d = 1 if adding from memory to register.
– s = 0 if adding eight-bit operands; s = 1 if adding 16/32-bit operands.
• Bit number zero, marked s, specifies the size of the operands the ADD instruction operates upon:
– If s = 0 then the operands are 8-bit registers and memory locations.
– If s = 1 then the operands are either 16 bits or 32 bits:
  • Under 32-bit operating systems the default is 32-bit operands if s = 1.
  • To specify a 16-bit operand (under Windows or Linux) you must insert a special operand-size prefix byte in front of the instruction.
• Bit number one, marked d, specifies the direction of the data transfer:
– If d = 0 then the destination operand is a memory location, e.g. add [ebx], al
– If d = 1 then the destination operand is a register, e.g. add al, [ebx]

Encoding x86 Instruction Operands, MOD-REG-R/M Byte (1/4)
• The MOD-REG-R/M byte specifies instruction operands and their addressing mode (*):

  bits:   7 6   5 4 3        2 1 0
          Mod   Reg/OpCode   R/M

• The R/M field, combined with MOD, specifies either
– the second operand in a two-operand instruction, or
– the only operand in a single-operand instruction like NOT or NEG.
• The d bit in the opcode determines which operand is the source, and which is the destination:
– d=0: MOD R/M <- REG, REG is the source
– d=1: REG <- MOD R/M, REG is the destination
(*) Technically, registers do not have an address, but we apply the term addressing mode to registers nonetheless.

Encoding x86 Instruction Operands, MOD-REG-R/M Byte (2/4)
• The MOD field specifies the x86 addressing mode:

MOD   Meaning
00    Register indirect addressing mode, or SIB with no displacement (when R/M = 100), or displacement-only addressing mode (when R/M = 101).
01    One-byte signed displacement follows the addressing mode byte(s).
10    Four-byte signed displacement follows the addressing mode byte(s).
11    Register addressing mode.

Encoding x86 Instruction Operands, MOD-REG-R/M Byte (3/4)
• The REG field specifies the source or destination register:
– Depending on the instruction, this can be either the source or the destination operand.
– Many instructions have the d (direction) field in their opcode to choose the role of the REG operand:
  • If d=0, REG is the source, MOD R/M <- REG.
  • If d=1, REG is the destination, REG <- MOD R/M.
• For certain (often single-operand or immediate-operand) instructions, the REG field may contain an opcode extension rather than register bits. The R/M field specifies the operand in that case.
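The bit layouts described above can be sketched as a few shifts and masks. This is a minimal illustration, assuming a register/memory ADD opcode of the form 000000ds; the function names are hypothetical and chosen only for this example.

```python
def decode_add_opcode(opcode):
    """Split a register/memory ADD opcode (000000ds) into its d and s bits."""
    if opcode >> 2 != 0b000000:
        raise ValueError("not an ADD opcode of the form 000000ds")
    return {"d": (opcode >> 1) & 1, "s": opcode & 1}


def split_modrm(byte):
    """Return the (MOD, REG, R/M) fields of a MOD-REG-R/M byte."""
    mod = (byte >> 6) & 0b11    # bits 7-6
    reg = (byte >> 3) & 0b111   # bits 5-3
    rm = byte & 0b111           # bits 2-0
    return mod, reg, rm


def describe_direction(d):
    """Spell out the role of REG for a given direction bit."""
    if d == 0:
        return "MOD R/M <- REG (REG is the source)"
    return "REG <- MOD R/M (REG is the destination)"


bits = decode_add_opcode(0x03)
print(bits, describe_direction(bits["d"]))
print(split_modrm(0xC1))  # (3, 0, 1)
```

Here 0xC1 is just a sample byte: its top two bits are 11, the middle three are 000, and the low three are 001.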
Encoding x86 Instruction Operands, MOD-REG-R/M Byte (4/4)
• The REG field specifies the source or destination register:

REG Value   8-bit data   16-bit data   32-bit data
000         al           ax            eax
001         cl           cx            ecx
010         dl           dx            edx
011         bl           bx            ebx
100         ah           sp            esp
101         ch           bp            ebp
110         dh           si            esi
111         bh           di            edi

MOD R/M Byte and Addressing Modes

MOD  R/M  Addressing Mode
===  ===  ========================
00   000  [ eax ]
01   000  [ eax + disp8 ]            (1)
10   000  [ eax + disp32 ]
11   000  register ( al / ax / eax ) (2)
00   001  [ ecx ]
01   001  [ ecx + disp8 ]
10   001  [ ecx + disp32 ]
11   001  register ( cl / cx / ecx )
00   010  [ edx ]
01   010  [ edx + disp8 ]
10   010  [ edx + disp32 ]
11   010  register ( dl / dx / edx )
00   011  [ ebx ]
01   011  [ ebx + disp8 ]
10   011  [ ebx + disp32 ]
11   011  register ( bl / bx / ebx )
00   100  SIB Mode                   (3)
01   100  SIB + disp8 Mode
10   100  SIB + disp32 Mode
11   100  register ( ah / sp / esp )
00   101  32-bit Disp-Only Mode      (4)
01   101  [ ebp + disp8 ]
10   101  [ ebp + disp32 ]
11   101  register ( ch / bp / ebp )
00   110  [ esi ]
01   110  [ esi + disp8 ]
10   110  [ esi + disp32 ]
11   110  register ( dh / si / esi )
00   111  [ edi ]
01   111  [ edi + disp8 ]
10   111  [ edi + disp32 ]
11   111  register ( bh / di / edi )

(1) Addressing modes with an 8-bit displacement cover the range -128..+127 and require only a single displacement byte after the opcode (faster!).
(2) The size bit in the opcode specifies 8-bit or 32-bit register size. Selecting a 16-bit register requires a prefix byte.
(3) The so-called scaled indexed addressing modes; SIB = scaled index byte mode.
(4) Note that there is no [ ebp ] addressing. Its slot is occupied by the 32-bit displacement-only addressing mode. Intel decided that programmers can use the [ ebp + disp8 ] addressing mode instead, with its 8-bit displacement set to zero (the instruction is a little longer, though).
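The addressing-mode table above follows a regular pattern with two special cases (R/M = 100 and MOD = 00 with R/M = 101), which a short lookup function can make explicit. This is a sketch for 32-bit addressing only; the function and list names are illustrative assumptions.

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def addressing_mode(mod, rm):
    """Describe the operand selected by the MOD and R/M fields."""
    if mod == 0b11:
        # Register operand; the s bit and any prefix pick the actual size.
        return f"register ({REG8[rm]}/{REG16[rm]}/{REG32[rm]})"
    if rm == 0b100:
        # R/M = 100 means a SIB byte follows instead of using ESP as a base.
        return ["SIB", "SIB + disp8", "SIB + disp32"][mod]
    if mod == 0b00 and rm == 0b101:
        # The slot that would be [ebp] is the displacement-only mode.
        return "disp32"
    base = REG32[rm]
    if mod == 0b00:
        return f"[{base}]"
    if mod == 0b01:
        return f"[{base} + disp8]"
    return f"[{base} + disp32]"


for mod, rm in [(0, 3), (1, 6), (2, 5), (0, 5), (0, 4), (3, 1)]:
    print(f"{mod:02b} {rm:03b}  {addressing_mode(mod, rm)}")
```

Keeping the two special cases as early returns mirrors how the table is read: the exceptions are checked first, and everything else is a base register plus an optional displacement.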
SIB (Scaled Index Byte) Layout
• Scaled indexed addressing mode uses a second byte (namely, the SIB byte) that follows the MOD-REG-R/M byte in the instruction format.
• Scaled index byte layout:

  bits:   7 6     5 4 3   2 1 0
          Scale   Index   Base

• The MOD field still specifies the displacement size of zero, one, or four bytes.
– The MOD-REG-R/M and SIB bytes are complex, because Intel reused 16-bit addressing circuitry in the 32-bit mode, rather than simply abandoning the 16-bit format in the 32-bit mode.
– There are good hardware reasons for this, but the end result is a complex scheme for specifying addressing modes in the opcodes.

Scale Value   Index * Scale Value
00            Index * 1
01            Index * 2
10            Index * 4
11            Index * 8

Index   Register
000     EAX
001     ECX
010     EDX
011     EBX
100     Illegal
101     EBP
110     ESI
111     EDI

Base   Base Register
000    EAX
001    ECX
010    EDX
011    EBX
100    ESP
101    Displacement only if MOD = 00; EBP if MOD = 01 or 10
110    ESI
111    EDI

Scaled Indexed Addressing Mode

Index   MOD = 00             MOD = 01                     MOD = 10                      MOD = 00, BASE = 101
eax     [ reg32 + eax*n ]    [ disp8 + reg32 + eax*n ]    [ disp32 + reg32 + eax*n ]    [ disp32 + eax*n ]
ebx     [ reg32 + ebx*n ]    [ disp8 + reg32 + ebx*n ]    [ disp32 + reg32 + ebx*n ]    [ disp32 + ebx*n ]
ecx     [ reg32 + ecx*n ]    [ disp8 + reg32 + ecx*n ]    [ disp32 + reg32 + ecx*n ]    [ disp32 + ecx*n ]
edx     [ reg32 + edx*n ]    [ disp8 + reg32 + edx*n ]    [ disp32 + reg32 + edx*n ]    [ disp32 + edx*n ]
ebp     [ reg32 + ebp*n ]    [ disp8 + reg32 + ebp*n ]    [ disp32 + reg32 + ebp*n ]    [ disp32 + ebp*n ]
esi     [ reg32 + esi*n ]    [ disp8 + reg32 + esi*n ]    [ disp32 + reg32 + esi*n ]    [ disp32 + esi*n ]
edi     [ reg32 + edi*n ]    [ disp8 + reg32 + edi*n ]    [ disp32 + reg32 + edi*n ]    [ disp32 + edi*n ]

Note: n = 1, 2, 4, or 8.
In each scaled indexed addressing mode the MOD field in the MOD-REG-R/M byte specifies the size of the displacement.
It can be zero, one, or four bytes:

MOD  R/M  Addressing Mode
00   100  SIB
01   100  SIB + disp8
10   100  SIB + disp32

The Base and Index fields of the SIB byte select the base and index registers, respectively.

Example: Encoding ADD Instruction
• The ADD opcode can be decimal 0, 1, 2, or 3, depending on the direction and size bits in the opcode:

  0 0 0 0 0 0 d s

– 000000 is the ADD opcode.
– d = 0 if adding from register to memory; d = 1 if adding from memory to register.
– s = 0 if adding eight-bit operands; s = 1 if adding 16/32-bit operands.
• How could we encode various forms of the ADD instruction using different addressing modes?

Encoding ADD CL, AL Instruction
  Opcode:        000000  d=0  s=0             = 00h
  MOD-REG-R/M:   MOD=11  REG=000  R/M=001     = C1h
– 000000 indicates the ADD instruction.
– d = 0 indicates that we are adding the REG field to the R/M field.
– s = 0 indicates that we are adding eight-bit values together.
– MOD = 11 indicates that the R/M field is a register.
– REG = 000, along with the d bit in the opcode, indicates that the source field is the AL register.
– R/M = 001, along with the d bit in the opcode, indicates that the destination field is the CL register.
ADD cl, al = 00 C1
• Interesting side effect of the direction bit and the MOD-REG-R/M byte organization: some instructions can have two different opcodes, and both are legal!
• For example, add cl, al could be encoded as 00 C1 (if d = 0), or as 02 C8 if the d bit is set to 1: 0000 0010 1100 1000 = 02 C8.
• This opcode duality applies to all instructions with two register operands.

Encoding ADD ECX, EAX Instruction
  Opcode:        000000  d=0  s=1             = 01h
  MOD-REG-R/M:   MOD=11  REG=000  R/M=001     = C1h
– 000000 indicates the ADD instruction.
– d = 0 indicates that we are adding the REG field to the R/M field.
– s = 1 indicates that we are adding 32-bit values together.
– MOD = 11 indicates that the R/M field is a register.
– REG = 000, along with the d bit in the opcode, indicates that the source field is the EAX register.
– R/M = 001, along with the d bit in the opcode, indicates that the destination field is the ECX register.
ADD ecx, eax = 01 C1 = 03 C8

Encoding ADD EDX, DISP Instruction
  Opcode:        000000  d=1  s=1             = 03h
  MOD-REG-R/M:   MOD=00  REG=010  R/M=101     = 15h
  DISP32:        32-bit displacement follows the instruction
– 000000 indicates the ADD instruction.
– d = 1 indicates that we are adding the R/M field to the REG field.
– s = 1 indicates that we are adding 32-bit values together.
– The combination of MOD = 00 and R/M = 101 indicates that this is the displacement-only addressing mode.
– REG = 010, along with the d bit in the opcode, indicates that the destination field is the EDX register.
ADD edx, disp = 03, 15, ww, xx, yy, zz (ww is the L.O. byte of the displacement, zz the H.O. byte)

Encoding ADD EDI, [EBX] Instruction
  Opcode:        000000  d=1  s=1             = 03h
  MOD-REG-R/M:   MOD=00  REG=111  R/M=011     = 3Bh
– 000000 indicates the ADD instruction.
– d = 1 indicates that we are adding the R/M field to the REG field.
– s = 1 indicates that we are adding 32-bit values together.
– MOD = 00 indicates a zero-byte displacement.
– R/M = 011 indicates the use of the [EBX] addressing mode.
– REG = 111, along with the d bit in the opcode, indicates that the destination field is the EDI register.
ADD edi, [ebx] = 03, 3B

Encoding ADD EAX, [ESI + disp8] Instruction
  Opcode:        000000  d=1  s=1             = 03h
  MOD-REG-R/M:   MOD=01  REG=000  R/M=110     = 46h
  DISP8:         8-bit displacement follows the MOD-REG-R/M byte
– 000000 indicates the ADD instruction.
– d = 1 indicates that we are adding the R/M field to the REG field.
– s = 1 indicates that we are adding 32-bit values together.
– MOD = 01 indicates a one-byte displacement.
– R/M = 110 indicates the use of the [ESI] addressing mode.
– REG = 000, along with the d bit in the opcode, indicates that the destination field is the EAX register.
ADD eax, [esi + disp8] = 03, 46, xx

Encoding ADD EBX, [EBP + disp32] Instruction
  Opcode:        000000  d=1  s=1             = 03h
  MOD-REG-R/M:   MOD=10  REG=011  R/M=101     = 9Dh
  DISP32:        32-bit displacement follows the instruction
– 000000 indicates the ADD instruction.
– d = 1 indicates that we are adding the R/M field to the REG field.
– s = 1 indicates that we are adding 32-bit values together.
– MOD = 10 indicates the use of a 32-bit displacement.
– R/M = 101 indicates the use of [EBP].
– REG = 011, along with the d bit in the opcode, indicates that the destination field is the EBX register.
ADD ebx, [ebp + disp32] = 03, 9D, ww, xx, yy, zz

What is Scale Index Mode?
• Mainly used to work with arrays
– for example, base could hold the location of an array and index the offset of an element
– or, the displacement field is the name of the array and the index is the offset of an element
• Multiply the index register by a scaling factor of 1, 2, 4 or 8 to adjust for the size of operands of different types
– an array of WORD requires a factor of 2
– an array of DWORD requires a factor of 4

Scale Index Addressing
• Can use a displacement (variable name) with an index (offset) and a scaling factor (n)
– displacement[index*n]
• or, a base register (pointer to a variable) with an index (offset) and a scaling factor (n)
– [base][index*n]
• or, all of the above
– displacement[base][index*n]

Effective Address Calculation
Effective Address = Base + (Index * Scale) + Displacement

Encoding ADD EBP, [disp32 + EAX*1] Instruction
  Opcode:        000000  d=1  s=1               = 03h
  MOD-REG-R/M:   MOD=00  REG=101  R/M=100       = 2Ch
  SIB:           Scale=00  Index=000  Base=101  = 05h
  DISP32:        32-bit displacement follows the SIB byte
– 000000 indicates the ADD instruction.
– d = 1 indicates that we are adding the R/M field to the REG field.
– s = 1 indicates that we are adding 32-bit values together.
– MOD = 00 and R/M = 100 select SIB mode.
– REG = 101: EBP is the destination register.
– Scale = 00 and Index = 000 select the EAX*1 scaled index mode.
– Base = 101 (with MOD = 00) means displacement-only addressing, so there is no base register.
ADD ebp, [disp32 + eax*1] = 03, 2C, 05, ww, xx, yy, zz

Encoding ADD ECX, [EBX + EDI*4] Instruction
  Opcode:        000000  d=1  s=1               = 03h
  MOD-REG-R/M:   MOD=00  REG=001  R/M=100       = 0Ch
  SIB:           Scale=10  Index=111  Base=011  = BBh
– 000000 indicates the ADD instruction.
– d = 1 indicates that we are adding the R/M field to the REG field.
– s = 1 indicates that we are adding 32-bit values together.
– MOD = 00 and R/M = 100 select SIB mode.
– REG = 001: ECX is the destination register.
– Scale = 10 and Index = 111 select the EDI*4 scaled index mode.
– Base = 011 selects EBX as the base register.
ADD ecx, [ebx + edi*4] = 03, 0C, BB

Encoding ADD Immediate Instruction
  Opcode:        1 0 0 0 0 0 x s
  MOD-REG-R/M:   MOD  REG=000  R/M
– 100000 indicates that this is an immediate mode instruction.
– x = 0: the constant is the same size as specified by s. x = 1: the constant is a 1-byte operand that is sign extended to the size of the operand.
– s = 0: 8-bit operands. s = 1: 32-bit operands.
– The REG field holds an opcode extension, 000 for ADD.
– MOD and R/M specify the destination operand.
– An optional one- or four-byte displacement follows (as specified by MOD-R/M).
– An 8-, 16- or 32-bit constant follows the instruction.

Summary
• Encoding Real x86 Instructions
• x86 Instruction Format Reference
• x86 Opcode Sizes
• x86 ADD Instruction Opcode
• Encoding x86 Instruction Operands, MOD-REG-R/M Byte
• REG Field of the MOD-REG-R/M Byte
• MOD R/M Byte and Addressing Modes
• SIB (Scaled Index Byte) Layout
• Scaled Indexed Addressing Mode
• Encoding ADD Instruction Example
• Encoding ADD CL, AL Instruction
• Encoding ADD ECX, EAX Instruction
• Encoding ADD EDX, DISPLACEMENT Instruction
• Encoding ADD EDI, [EBX] Instruction
• Encoding ADD EAX, [ ESI + disp8 ] Instruction
• Encoding ADD EBX, [ EBP + disp32 ] Instruction
• Encoding ADD EBP, [ disp32 + EAX*1 ] Instruction
• Encoding ADD ECX, [ EBX + EDI*4 ] Instruction
• Encoding ADD Immediate Instruction

Reference
• Instruction Format Design
– http://www.c-jump.com/CIS77/CPU/IsaDesign/index.html
• Encoding Real x86 Instructions
– http://www.c-jump.com/CIS77/CPU/x86/lecture.html#X77_0010_real_encoding
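As a cross-check on the ADD encodings worked through in this lecture, the opcode, MOD-REG-R/M and SIB bytes can be assembled from their bit fields in a few lines. This is a sketch under stated assumptions: the helper names are illustrative, displacement bytes are left out, and only the register numbers from the REG table are used. Note that with EDX = 010 in the REG table, ADD EDX, disp32 works out to a MOD-REG-R/M byte of 15h.

```python
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3,
        "ah": 4, "ch": 5, "dh": 6, "bh": 7}
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def add_opcode(d, s):
    """ADD opcode 000000ds."""
    return (d << 1) | s


def modrm(mod, reg, rm):
    """Pack MOD (2 bits), REG (3 bits) and R/M (3 bits) into one byte."""
    return (mod << 6) | (reg << 3) | rm


def sib(scale, index, base):
    """Pack the scale factor (1, 2, 4, 8), index and base into a SIB byte."""
    return (SCALE_BITS[scale] << 6) | (index << 3) | base


def hexbytes(*values):
    return " ".join(f"{v:02X}" for v in values)


ENCODINGS = {
    "add cl, al (d=0)": hexbytes(add_opcode(0, 0), modrm(0b11, REG8["al"], REG8["cl"])),
    "add cl, al (d=1)": hexbytes(add_opcode(1, 0), modrm(0b11, REG8["cl"], REG8["al"])),
    "add ecx, eax (d=0)": hexbytes(add_opcode(0, 1), modrm(0b11, REG32["eax"], REG32["ecx"])),
    "add ecx, eax (d=1)": hexbytes(add_opcode(1, 1), modrm(0b11, REG32["ecx"], REG32["eax"])),
    "add edx, [disp32]": hexbytes(add_opcode(1, 1), modrm(0b00, REG32["edx"], 0b101)),
    "add edi, [ebx]": hexbytes(add_opcode(1, 1), modrm(0b00, REG32["edi"], REG32["ebx"])),
    "add eax, [esi+disp8]": hexbytes(add_opcode(1, 1), modrm(0b01, REG32["eax"], REG32["esi"])),
    "add ebx, [ebp+disp32]": hexbytes(add_opcode(1, 1), modrm(0b10, REG32["ebx"], REG32["ebp"])),
    "add ebp, [disp32+eax*1]": hexbytes(add_opcode(1, 1), modrm(0b00, REG32["ebp"], 0b100),
                                        sib(1, REG32["eax"], 0b101)),
    "add ecx, [ebx+edi*4]": hexbytes(add_opcode(1, 1), modrm(0b00, REG32["ecx"], 0b100),
                                     sib(4, REG32["edi"], REG32["ebx"])),
}

for text, encoding in ENCODINGS.items():
    print(f"{text:26s} {encoding}")
```

Each entry lists only the opcode and addressing bytes; any disp8 or disp32 bytes would follow them, low-order byte first.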