x86 Instructions
975 instructions.
| Mnemonic | Syntax | Format | Extension | Summary |
|---|---|---|---|---|
| aaa | AAA | Legacy | Base (Legacy) | Adjusts AL after addition for unpacked BCD. |
| aad | AAD imm8 | Legacy | Base (Legacy) | Adjusts AX before division for unpacked BCD. |
| aadd | AADD m32, r32 | Legacy | RAO-INT | Atomically adds a register value to a memory location (remote atomic operation, no value returned). |
| aam | AAM imm8 | Legacy | Base (Legacy) | Adjusts AX after multiply for unpacked BCD. |
| aand | AAND m32, r32 | Legacy | RAO-INT | Atomically ANDs a register value into a memory location (remote atomic operation, no value returned). |
| aas | AAS | Legacy | Base (Legacy) | Adjusts AL after subtraction for unpacked BCD. |
| adc | ADC r/m, r | Legacy | Base | Adds operands and the Carry Flag (CF). |
| adcx | ADCX r32, r/m32 | Legacy | ADX | Adds with Carry Flag (distinct from ADC, affects CF only). |
| add | ADD r/m, r | Legacy | Base | Adds src to dest and stores result in dest. |
| addpd | ADDPD xmm, xmm/m128 | SSE2 | SSE2 | Adds two 64-bit doubles. |
| addps | ADDPS xmm, xmm/m128 | SSE | SSE | Adds four 32-bit floats. |
| addsd | ADDSD xmm1, xmm2/m64 | SSE2 | SSE2 | Adds the low 64-bit double from source to destination. |
| addss | ADDSS xmm1, xmm2/m32 | SSE | SSE | Adds the low 32-bit float from source to destination. |
| addsubpd | ADDSUBPD xmm1, xmm2/m128 | SSE3 | SSE3 | Adds odd elements, subtracts even elements (Double). |
| addsubps | ADDSUBPS xmm1, xmm2/m128 | SSE3 | SSE3 | Adds odd elements, subtracts even elements (Single). |
| adox | ADOX r32, r/m32 | Legacy | ADX | Adds with Overflow Flag (Parallel addition with ADCX). |
| aesdec | AESDEC xmm1, xmm2/m128 | AES-NI | AES-NI | Performs one round of AES decryption flow. |
| aesdec128kl | AESDEC128KL xmm, m384 | Legacy | KEYLOCKER | Decrypts data using 128-bit Key Locker handle. |
| aesdec256kl | AESDEC256KL xmm, m512 | Legacy | KEYLOCKER | Decrypts data using 256-bit Key Locker handle. |
| aesdecwide128kl | AESDECWIDE128KL m384 | Legacy | KEYLOCKER_WIDE | Decrypts 8 blocks (XMM0-7) using 128-bit Key Locker handle. |
| aesdecwide256kl | AESDECWIDE256KL m512 | Legacy | KEYLOCKER_WIDE | Decrypts 8 blocks (XMM0-7) using 256-bit Key Locker handle. |
| aesenc | AESENC xmm1, xmm2/m128 | AES-NI | AES-NI | Performs one round of AES encryption flow. |
| aesenc128kl | AESENC128KL xmm, m384 | Legacy | KEYLOCKER | Encrypts data using 128-bit Key Locker handle. |
| aesenc256kl | AESENC256KL xmm, m512 | Legacy | KEYLOCKER | Encrypts data using 256-bit Key Locker handle. |
| aesenclast | AESENCLAST xmm1, xmm2/m128 | AES-NI | AES-NI | Performs the last round of AES encryption. |
| aesencwide128kl | AESENCWIDE128KL m384 | Legacy | KEYLOCKER_WIDE | Encrypts 8 blocks (XMM0-7) using 128-bit Key Locker handle. |
| aesencwide256kl | AESENCWIDE256KL m512 | Legacy | KEYLOCKER_WIDE | Encrypts 8 blocks (XMM0-7) using 256-bit Key Locker handle. |
| aesimc | AESIMC xmm1, xmm2/m128 | AES-NI | AES-NI | Performs AES InvMixColumns transformation (decryption helper). |
| aeskeygenassist | AESKEYGENASSIST xmm1, xmm2/m128, imm8 | AES-NI | AES-NI | Assists AES round key generation (key expansion step). |
| and | AND r/m, r | Legacy | Base | Performs bitwise AND. |
| andn | ANDN r32, r32, r/m32 | VEX | BMI1 | Calculates (NOT src1) AND src2. Non-destructive. |
| andpd | ANDPD xmm, xmm/m128 | SSE2 | SSE2 | Bitwise AND of 128 bits. |
| andps | ANDPS xmm, xmm/m128 | SSE | SSE | Bitwise AND of 128 bits. |
| aor | AOR m32, r32 | Legacy | RAO-INT | Atomically ORs a register value into a memory location (remote atomic operation, no value returned). |
| arpl | ARPL r/m16, r16 | System | System (32-bit) | Raises the RPL of the destination selector to at least the RPL of the source selector; sets ZF if adjusted (Legacy). |
| axor | AXOR m32, r32 | Legacy | RAO-INT | Atomically XORs a register value into a memory location (remote atomic operation, no value returned). |
| bextr | BEXTR r32, r/m32, r32 | VEX | BMI1 | Extracts sequence of bits from source using index/length. |
| blcfill | BLCFILL r32, r/m32 | TBM | TBM | Clears all bits below the lowest clear bit (x & (x+1)). |
| blci | BLCI r32, r/m32 | TBM | TBM | Sets all bits to 1 except the lowest clear bit (x \| ~(x+1)). |
| blcic | BLCIC r32, r/m32 | TBM | TBM | Isolates lowest clear bit (~x & (x+1)). |
| blcmsk | BLCMSK r32, r/m32 | TBM | TBM | Creates mask from lowest clear bit (x ^ (x+1)). |
| blcs | BLCS r32, r/m32 | TBM | TBM | Sets lowest clear bit (x \| (x+1)). |
| blendpd | BLENDPD xmm1, xmm2/m128, imm8 | SSE4.1 | SSE4.1 | Selects doubles from two sources based on immediate mask. |
| blendps | BLENDPS xmm1, xmm2/m128, imm8 | SSE4.1 | SSE4.1 | Selects floats from two sources based on immediate mask. |
| blendvpd | BLENDVPD xmm1, xmm2/m128, <XMM0> | SSE4.1 | SSE4.1 | Blends doubles based on variable mask in XMM0. |
| blendvps | BLENDVPS xmm1, xmm2/m128, <XMM0> | SSE4.1 | SSE4.1 | Blends floats based on variable mask in XMM0. |
| blsfill | BLSFILL r32, r/m32 | TBM | TBM | Sets all bits below lowest set bit ((x-1) \| x). |
| blsi | BLSI r32, r/m32 | VEX | BMI1 | Extracts the lowest set bit (x & -x). |
| blsic | BLSIC r32, r/m32 | TBM | TBM | Isolates lowest set bit and complements (~x \| (x-1)). |
| blsmsk | BLSMSK r32, r/m32 | VEX | BMI1 | Creates mask up to lowest set bit (x ^ (x-1)). |
| blsr | BLSR r32, r/m32 | VEX | BMI1 | Clears the lowest set bit (x & (x-1)). |
| bndcl | BNDCL b, r/m | Legacy | MPX | Checks if address is within lower bound. |
| bndcu | BNDCU b, r/m | Legacy | MPX | Checks if address is within upper bound. |
| bndmk | BNDMK b, m | Legacy | MPX | Creates bounds data for MPX. |
| bndmov | BNDMOV b, b/m | Legacy | MPX | Moves MPX bounds data. |
| bound | BOUND r, m | Legacy | Base (32-bit only) | Checks if operand is within bounds defined in memory. |
| bsf | BSF r, r/m | Legacy | Base | Scans for LSB set to 1. |
| bsr | BSR r, r/m | Legacy | Base | Scans for MSB set to 1. |
| bswap | BSWAP r | Legacy | Base | Reverses the byte order of a 32/64-bit register. |
| bt | BT r/m, r | Legacy | Base | Selects a bit and stores it in CF. |
| btc | BTC r/m, r | Legacy | Base | Stores bit in CF and complements the bit. |
| btr | BTR r/m, r | Legacy | Base | Stores bit in CF and clears bit to 0. |
| bts | BTS r/m, r | Legacy | Base | Stores bit in CF and sets bit to 1. |
| bzhi | BZHI r32, r/m32, r32 | VEX | BMI2 | Clears high bits starting at index. |
| call | CALL rel | Legacy | Base | Pushes the return address (EIP/RIP of the next instruction) and jumps to target. |
| cbw | CBW | Legacy | Base | Sign-extends AL into AX. |
| clac | CLAC | Legacy | SMAP | Clears the AC flag in EFLAGS, re-enabling SMAP protection against supervisor access to user pages. |
| clc | CLC | Legacy | Base | Sets the CF flag to 0. |
| cld | CLD | Legacy | Base | Sets DF to 0 (String operations increment). |
| cldemote | CLDEMOTE m8 | Legacy | CLDEMOTE | Hints to move cache line to lower cache level. |
| clflush | CLFLUSH m8 | SSE2 | SSE2 | Flushes the cache line containing the operand from all caches. |
| clflushopt | CLFLUSHOPT m8 | Legacy | CLFLUSHOPT | Optimized version of CLFLUSH (Higher throughput). |
| clgi | CLGI | SVM | SVM | Disables global interrupts (AMD SVM). |
| cli | CLI | Legacy | Base | Disables maskable hardware interrupts. |
| clrssbsy | CLRSSBSY m64 | Legacy | CET-SS | Clears the busy flag in the shadow stack token. |
| cltd | CLTD | Legacy | Base | Sign-extends EAX into EDX:EAX (also CDQ). |
| clts | CLTS | System | System | Clears the TS flag in CR0 (Privileged). |
| clui | CLUI | Legacy | UINTR | Clears the User Interrupt Flag (UIF). |
| clwb | CLWB m8 | Legacy | CLWB | Writes back modified cache line without flushing (Persistent Memory). |
| clzero | CLZERO | AMD | CLZERO | Zeroes the cache line containing the address in RAX/EAX (AMD). |
| cmc | CMC | Legacy | Base | Toggles the CF flag. |
| cmovcc | CMOVcc r, r/m | Legacy | CMOV | Moves data if condition code is met (e.g., CMOVE, CMOVNE). |
| cmovg | CMOVG r, r/m | Legacy | CMOV | Move if ZF=0 and SF=OF. |
| cmovge | CMOVGE r, r/m | Legacy | CMOV | Move if SF=OF. |
| cmovl | CMOVL r, r/m | Legacy | CMOV | Move if SF!=OF. |
| cmovle | CMOVLE r, r/m | Legacy | CMOV | Move if ZF=1 or SF!=OF. |
| cmovnz | CMOVNZ r, r/m | Legacy | CMOV | Move if ZF=0. |
| cmovz | CMOVZ r, r/m | Legacy | CMOV | Move if ZF=1. |
| cmp | CMP r/m, r | Legacy | Base | Subtracts src from dest and updates flags (dest not modified). |
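| cmpccxadd | CMPccXADD m32, r32, r32 | VEX | CMPccXADD | Atomically compares memory with the first register and, if the condition is met, adds the second register to memory; the original memory value is returned in the first register. |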
| cmps | CMPSB | Legacy | Base | Compares byte/word at [ESI] with [EDI]. |
| cmpsd | CMPSD xmm1, xmm2/m64, imm8 | SSE2 | SSE2 | Compares low double-precision values and returns mask. |
| cmpsd | CMPSD | Legacy | Base | Compares doubleword at [ESI] with [EDI]. |
| cmpsq | CMPSQ | Legacy | Base (64-bit) | Compares quadword at [RSI] with [RDI]. |
| cmpss | CMPSS xmm1, xmm2/m32, imm8 | SSE | SSE | Compares low single-precision values and returns mask. |
| cmpsw | CMPSW | Legacy | Base | Compares word at [ESI] with [EDI]. |
| cmpxchg | CMPXCHG r/m, r | Legacy | Base | Compares accumulator with dest; if equal, dest = src; else accumulator = dest. |
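| cmpxchg16b | CMPXCHG16B m128 | Legacy | Base (64-bit) | Compares RDX:RAX with 128-bit memory; if equal, stores RCX:RBX, else loads memory into RDX:RAX (atomic with LOCK prefix). |
| cmpxchg8b | CMPXCHG8B m64 | Legacy | Base | Compares EDX:EAX with 64-bit memory; if equal, stores ECX:EBX, else loads memory into EDX:EAX (atomic with LOCK prefix). |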
| comisd | COMISD xmm1, xmm2/m64 | SSE2 | SSE2 | Compares low doubles and sets EFLAGS (any NaN operand, quiet or signaling, raises the invalid-operation exception). |
| comiss | COMISS xmm1, xmm2/m32 | SSE | SSE | Compares low floats and sets EFLAGS (any NaN operand, quiet or signaling, raises the invalid-operation exception). |
| cpuid | CPUID | Legacy | Base | Returns processor information based on EAX value. |
| cqto | CQTO | Legacy | Base (64-bit) | Sign-extends RAX into RDX:RAX (also CQO). |
| crc32 | CRC32 r32, r/m | SSE4.2 | SSE4.2 | Accumulates CRC32C value using polynomial 0x11EDC6F41. |
| cvtdq2pd | CVTDQ2PD xmm1, xmm2/m64 | SSE2 | SSE2 | Converts two 32-bit integers to two 64-bit doubles. |
| cvtdq2ps | CVTDQ2PS xmm1, xmm2/m128 | SSE2 | SSE2 | Converts four 32-bit integers to floats. |
| cvtpd2dq | CVTPD2DQ xmm1, xmm2/m128 | SSE2 | SSE2 | Converts two doubles to two 32-bit integers (Rounded). |
| cvtpd2ps | CVTPD2PS xmm1, xmm2/m128 | SSE2 | SSE2 | Converts two doubles to two floats. |
| cvtps2dq | CVTPS2DQ xmm1, xmm2/m128 | SSE2 | SSE2 | Converts four floats to 32-bit integers (Rounded). |
| cvtps2pd | CVTPS2PD xmm1, xmm2/m64 | SSE2 | SSE2 | Converts lower two floats to doubles. |
| cvtsd2si | CVTSD2SI r32, xmm/m64 | SSE2 | SSE2 | Converts low double to integer (Rounded according to MXCSR). |
| cvtsd2sq | CVTSD2SQ r64, xmm/m64 | SSE2 | Base (64-bit) | Converts double to 64-bit integer (Rounded). |
| cvtsd2ss | CVTSD2SS xmm, xmm/m64 | SSE2 | SSE2 | Converts double to float. |
| cvtsi2sd | CVTSI2SD xmm, r/m32 | SSE2 | SSE2 | Converts 32-bit int to double. |
| cvtsi2ss | CVTSI2SS xmm, r/m32 | SSE | SSE | Converts 32-bit int to float. |
| cvtsq2sd | CVTSQ2SD xmm1, r/m64 | SSE2 | Base (64-bit) | Converts 64-bit integer to double. |
| cvtsq2ss | CVTSQ2SS xmm1, r/m64 | SSE | Base (64-bit) | Converts 64-bit integer to float. |
| cvtss2sd | CVTSS2SD xmm, xmm/m32 | SSE2 | SSE2 | Converts float to double. |
| cvtss2si | CVTSS2SI r32, xmm/m32 | SSE | SSE | Converts low float to integer (Rounded according to MXCSR). |
| cvtss2sq | CVTSS2SQ r64, xmm/m32 | SSE | Base (64-bit) | Converts float to 64-bit integer (Rounded). |
| cvttpd2dq | CVTTPD2DQ xmm1, xmm2/m128 | SSE2 | SSE2 | Converts two doubles to two 32-bit integers (Truncated). |
| cvttps2dq | CVTTPS2DQ xmm1, xmm2/m128 | SSE2 | SSE2 | Converts four floats to 32-bit integers (Truncated). |
| cvttps2pi | CVTTPS2PI mm, xmm/m64 | SSE | SSE | Converts packed floats to packed MMX integers (Truncate). |
| cvttsd2si | CVTTSD2SI r32, xmm/m64 | SSE2 | SSE2 | Converts double to 32-bit int (Truncate). |
| cvttsd2sq | CVTTSD2SQ r64, xmm/m64 | SSE2 | Base (64-bit) | Converts double to 64-bit integer (Truncated). |
| cvttss2si | CVTTSS2SI r32, xmm/m32 | SSE | SSE | Converts float to 32-bit int (Truncate). |
| cvttss2sq | CVTTSS2SQ r64, xmm/m32 | SSE | Base (64-bit) | Converts float to 64-bit integer (Truncated). |
| cwd | CWD | Legacy | Base | Sign-extends AX into DX:AX. |
| cwtl | CWTL | Legacy | Base | Sign-extends AX into EAX (also CWDE). |
| daa | DAA | Legacy | Base (Legacy) | Adjusts AL after addition for packed BCD. |
| das | DAS | Legacy | Base (Legacy) | Adjusts AL after subtraction for packed BCD. |
| dec | DEC r/m | Legacy | Base | Decrements the operand by 1. |
| div | DIV r/m | Legacy | Base | Unsigned divide of AX, DX:AX, EDX:EAX, or RDX:RAX by src; quotient and remainder go to AL/AH or (E/R)AX/(E/R)DX. |
| divpd | DIVPD xmm, xmm/m128 | SSE2 | SSE2 | Divides two 64-bit doubles. |
| divps | DIVPS xmm, xmm/m128 | SSE | SSE | Divides four 32-bit floats. |
| divsd | DIVSD xmm1, xmm2/m64 | SSE2 | SSE2 | Divides the low double-precision floating-point value. |
| divss | DIVSS xmm1, xmm2/m32 | SSE | SSE | Divides the low single-precision floating-point value. |
| dppd | DPPD xmm1, xmm2/m128, imm8 | SSE4.1 | SSE4.1 | Computes the dot product of two double vectors. |
| dpps | DPPS xmm1, xmm2/m128, imm8 | SSE4.1 | SSE4.1 | Computes the dot product of two float vectors. |
| emms | EMMS | MMX | MMX | Marks all x87 registers as empty in the FPU tag word so x87 instructions can follow MMX code. |
| encls | ENCLS | Legacy | SGX | Executes an SGX supervisor function specified by EAX. |
| enclu | ENCLU | Legacy | SGX | Executes an SGX user function specified by EAX. |
| encodekey128 | ENCODEKEY128 r32, r32 | Legacy | KEYLOCKER | Wraps a 128-bit AES key into a handle. |
| encodekey256 | ENCODEKEY256 r32, r32 | Legacy | KEYLOCKER | Wraps a 256-bit AES key into a handle. |
| endbr32 | ENDBR32 | Legacy | CET-IBT | Marker instruction for Indirect Branch Tracking (IBT). |
| endbr64 | ENDBR64 | Legacy | CET-IBT | Marker instruction for Indirect Branch Tracking (IBT). |
| enqcmd | ENQCMD r32, m512 | Legacy | ENQCMD | Writes a command to a device (DSA/IAA accelerator). |
| enqcmds | ENQCMDS r32, m512 | Legacy | ENQCMD | Writes a command to a device (Supervisor mode). |
| enter | ENTER imm16, imm8 | Legacy | Base | Creates a stack frame for procedure parameters. |
| erets | ERETS | Legacy | FRED | Returns from an event handler to supervisor mode (FRED). |
| eretu | ERETU | Legacy | FRED | Returns from an event handler to user mode (FRED). |
| extractps | EXTRACTPS r32/m32, xmm1, imm8 | SSE4.1 | SSE4.1 | Extracts a single float from XMM to an integer register. |
| extrq | EXTRQ xmm1, xmm2 | SSE4a | SSE4a | Extracts bit field from register (AMD SSE4a). |
| f2xm1 | F2XM1 | Legacy | x87 FPU | Computes (2^ST(0)) - 1. |
| fabs | FABS | Legacy | x87 FPU | Replaces ST(0) with its absolute value. |
| fadd | FADD m32fp/m64fp | Legacy | x87 FPU | Adds src to dest (ST(0) += src). |
| fchs | FCHS | Legacy | x87 FPU | Reverses the sign of ST(0). |
| fclex | FCLEX | Legacy | x87 FPU | Clears floating-point exception flags. |
| fcmovb | FCMOVB ST(0), ST(i) | Legacy | x87 FPU (P6+) | Moves ST(i) to ST(0) if CF=1. |
| fcmovbe | FCMOVBE ST(0), ST(i) | Legacy | x87 FPU (P6+) | Moves ST(i) to ST(0) if CF=1 or ZF=1. |
| fcmove | FCMOVE ST(0), ST(i) | Legacy | x87 FPU (P6+) | Moves ST(i) to ST(0) if ZF=1. |
| fcmovnb | FCMOVNB ST(0), ST(i) | Legacy | x87 FPU (P6+) | Moves ST(i) to ST(0) if CF=0. |
| fcmovnbe | FCMOVNBE ST(0), ST(i) | Legacy | x87 FPU (P6+) | Moves ST(i) to ST(0) if CF=0 and ZF=0. |
| fcmovne | FCMOVNE ST(0), ST(i) | Legacy | x87 FPU (P6+) | Moves ST(i) to ST(0) if ZF=0. |
| fcmovnu | FCMOVNU ST(0), ST(i) | Legacy | x87 FPU (P6+) | Moves ST(i) to ST(0) if PF=0. |
| fcmovu | FCMOVU ST(0), ST(i) | Legacy | x87 FPU (P6+) | Moves ST(i) to ST(0) if PF=1. |
| fcom | FCOM m32fp/m64fp | Legacy | x87 FPU | Compares ST(0) with source. |
| fcomi | FCOMI ST(0), ST(i) | Legacy | x87 FPU (P6+) | Compares ST(0) with ST(i) and sets CPU EFLAGS directly. |
| fcos | FCOS | Legacy | x87 FPU | Computes cosine of ST(0) (in radians). |
| fdecstp | FDECSTP | Legacy | x87 FPU | Decrements the TOP field in the FPU status word. |
| fdiv | FDIV m32fp/m64fp | Legacy | x87 FPU | Divides dest by src. |
| ffree | FFREE ST(i) | Legacy | x87 FPU | Sets the tag for ST(i) to empty. |
| fild | FILD m16int/m32int/m64int | Legacy | x87 FPU | Converts integer in memory to double-extended-precision float and pushes to ST(0). |
| fincstp | FINCSTP | Legacy | x87 FPU | Increments the TOP field in the FPU status word. |
| finit | FINIT | Legacy | x87 FPU | Resets FPU to default state. |
| fist | FIST m16int/m32int | Legacy | x87 FPU | Converts ST(0) to integer and stores in memory. |
| fistp | FISTP m16int/m32int/m64int | Legacy | x87 FPU | Converts ST(0) to integer, stores in memory, and pops stack. |
| fld | FLD m32fp/m64fp/m80fp | Legacy | x87 FPU | Pushes a floating-point value onto the FPU register stack (ST0). |
| fld1 | FLD1 | Legacy | x87 FPU | Pushes +1.0 onto the FPU register stack. |
| fldcw | FLDCW m2byte | Legacy | x87 FPU | Loads FPU control word from memory. |
| fldl2e | FLDL2E | Legacy | x87 FPU | Pushes log2(e) onto the FPU register stack. |
| fldl2t | FLDL2T | Legacy | x87 FPU | Pushes log2(10) onto the FPU register stack. |
| fldlg2 | FLDLG2 | Legacy | x87 FPU | Pushes log10(2) onto the FPU register stack. |
| fldln2 | FLDLN2 | Legacy | x87 FPU | Pushes ln(2) onto the FPU register stack. |
| fldpi | FLDPI | Legacy | x87 FPU | Pushes Pi onto the FPU register stack. |
| fldz | FLDZ | Legacy | x87 FPU | Pushes +0.0 onto the FPU register stack. |
| fmul | FMUL m32fp/m64fp | Legacy | x87 FPU | Multiplies dest by src. |
| fpatan | FPATAN | Legacy | x87 FPU | Computes arctan(ST(1)/ST(0)). |
| fprem | FPREM | Legacy | x87 FPU | Computes remainder of ST(0) / ST(1). |
| fptan | FPTAN | Legacy | x87 FPU | Computes tangent of ST(0) and pushes 1.0. |
| frndint | FRNDINT | Legacy | x87 FPU | Rounds ST(0) to integer according to RC field. |
| frstor | FRSTOR m108byte | Legacy | x87 FPU | Loads FPU state from memory. |
| fsave | FSAVE m108byte | Legacy | x87 FPU | Stores FPU state to memory and re-initializes FPU. |
| fscale | FSCALE | Legacy | x87 FPU | Scales ST(0) by ST(1) (ST(0) * 2^ST(1)). |
| fsin | FSIN | Legacy | x87 FPU | Computes sine of ST(0) (in radians). |
| fsincos | FSINCOS | Legacy | x87 FPU | Computes sine and cosine of ST(0), pushing both to stack. |
| fsqrt | FSQRT | Legacy | x87 FPU | Computes square root of ST(0). |
| fst | FST m32fp/m64fp | Legacy | x87 FPU | Copies the value in ST(0) to memory or another register. |
| fstcw | FSTCW m2byte | Legacy | x87 FPU | Stores FPU control word to memory. |
| fstp | FSTP m32fp/m64fp/m80fp | Legacy | x87 FPU | Copies ST(0) to destination and pops the register stack. |
| fstsw | FSTSW AX | Legacy | x87 FPU | Stores FPU status word to AX or memory. |
| fsub | FSUB m32fp/m64fp | Legacy | x87 FPU | Subtracts src from dest. |
| fucom | FUCOM ST(i) | Legacy | x87 FPU | Compares ST(0) with ST(i) (unordered: QNaN operands do not raise an invalid-operation exception). |
| fxch | FXCH ST(i) | Legacy | x87 FPU | Exchanges contents of ST(0) and ST(i). |
| fxtract | FXTRACT | Legacy | x87 FPU | Separates exponent and significand of ST(0). |
| fyl2x | FYL2X | Legacy | x87 FPU | Computes ST(1) * log2(ST(0)). |
| fyl2xp1 | FYL2XP1 | Legacy | x87 FPU | Computes ST(1) * log2(ST(0) + 1). |
| getsec | GETSEC | Legacy | SMX | Entry point for Safer Mode Extensions (Trusted Execution). |
| gf2p8affineinvqb | GF2P8AFFINEINVQB xmm1, xmm2/m128, imm8 | VEX | GFNI | Applies an affine transformation to the GF(2^8) inverse of each byte. |
| gf2p8affineqb | GF2P8AFFINEQB xmm1, xmm2/m128, imm8 | VEX | GFNI | Computes affine transformation in GF(2^8). |
| gf2p8mulb | GF2P8MULB xmm1, xmm2/m128 | VEX | GFNI | Multiplies bytes in GF(2^8). |
| haddpd | HADDPD xmm1, xmm2/m128 | SSE3 | SSE3 | Adds adjacent double-precision elements horizontally. |
| haddps | HADDPS xmm1, xmm2/m128 | SSE3 | SSE3 | Adds adjacent float elements horizontally. |
| hlt | HLT | Legacy | Base | Stops instruction execution and places processor in HALT state. |
| hreset | HRESET imm8 | Legacy | HRESET | Resets processor history (prediction) structures. |
| hsubpd | HSUBPD xmm1, xmm2/m128 | SSE3 | SSE3 | Subtracts adjacent double-precision elements horizontally. |
| hsubps | HSUBPS xmm1, xmm2/m128 | SSE3 | SSE3 | Subtracts adjacent single-precision elements horizontally. |
| idiv | IDIV r/m | Legacy | Base | Signed divide of AX, DX:AX, EDX:EAX, or RDX:RAX by src; quotient and remainder go to AL/AH or (E/R)AX/(E/R)DX. |
| imul | IMUL r, r/m | Legacy | Base | Signed multiply. |
| in | IN AL, imm8 | Legacy | Base | Reads data from an I/O port into AL/AX/EAX. |
| in_var | IN AL/AX/EAX, DX | Legacy | Base | Reads data from I/O port specified in DX. |
| inc | INC r/m | Legacy | Base | Increments the operand by 1. |
| incsspq | INCSSPQ r64 | Legacy | CET-SS | Adjusts the shadow stack pointer. |
| ins | INSB | Legacy | Base | Reads string from I/O port to memory at [EDI]. |
| insd | INSD | Legacy | Base | Reads doubleword from I/O port to memory at [EDI]. |
| insertps | INSERTPS xmm1, xmm2/m32, imm8 | SSE4.1 | SSE4.1 | Inserts a single float into a specific index of XMM. |
| insertq | INSERTQ xmm1, xmm2 | SSE4a | SSE4a | Inserts bit field into register (AMD SSE4a). |
| insw | INSW | Legacy | Base | Reads word from I/O port to memory at [EDI]. |
| int | INT imm8 | Legacy | Base | Calls the interrupt handler for the given vector. |
| int1 | INT1 | Legacy | Base | Raises a debug exception (vector 1); single-byte opcode 0xF1, historically used for In-Circuit Emulation. |
| int3 | INT3 | Legacy | Base | Raises a breakpoint trap on interrupt vector 3 (Debugger breakpoint). |
| invd | INVD | System | System | Flushes internal caches without writing back data (Privileged). |
| invept | INVEPT r64, m128 | VMX | VMX (EPT) | Invalidates Extended Page Table entries. |
| invlpg | INVLPG m | System | System | Invalidates a specific TLB entry (Privileged). |
| invlpga | INVLPGA | SVM | SVM | Invalidates TLB entry for specific ASID (AMD SVM). |
| invpcid | INVPCID r32, m128 | Legacy | INVPCID | Invalidates TLB entries based on PCID. |
| invvpid | INVVPID r64, m128 | VMX | VMX (VPID) | Invalidates TLB entries based on Virtual Processor ID. |
| iret | IRET | Legacy | Base | Returns from an interrupt, exception, or task handler. |
| iretd | IRETD | Legacy | Base | Returns from interrupt (32-bit operand size). |
| iretq | IRETQ | Legacy | Base (64-bit) | Returns from interrupt (64-bit operand size). |
| ja | JA rel | Legacy | Base | Jump if CF=0 and ZF=0 (Unsigned >). |
| jb | JB rel | Legacy | Base | Jump if CF=1 (Unsigned <). |
| je | JE rel | Legacy | Base | Jump if ZF=1 (Same as JZ). |
| jecxz | JECXZ rel | Legacy | Base | Jumps if ECX register is 0. |
| jg | JG rel | Legacy | Base | Jump if ZF=0 and SF=OF (Signed >). |
| jl | JL rel | Legacy | Base | Jump if SF!=OF (Signed <). |
| jmp | JMP rel | Legacy | Base | Unconditional jump to target. |
| jne | JNE rel | Legacy | Base | Jump if ZF=0 (Same as JNZ). |
| jno | JNO rel | Legacy | Base | Jump if OF=0. |
| jnp | JNP rel | Legacy | Base | Jump if PF=0 (Odd parity). |
| jns | JNS rel | Legacy | Base | Jump if SF=0 (Non-negative). |
| jo | JO rel | Legacy | Base | Jump if OF=1. |
| jp | JP rel | Legacy | Base | Jump if PF=1 (Even parity). |
| js | JS rel | Legacy | Base | Jump if SF=1 (Negative). |
| kaddb | KADDB k1, k2, k3 | VEX | AVX-512DQ | Adds two 8-bit mask registers. |
| kaddw | KADDW k1, k2, k3 | VEX | AVX-512DQ | Adds two 16-bit mask registers. |
| kandnw | KANDNW k1, k2, k3 | VEX | AVX-512F | Bitwise AND NOT of 16-bit masks. |
| kandq | KANDQ k1, k2, k3 | VEX | AVX-512BW | Bitwise AND of 64-bit masks. |
| kandw | KANDW k1, k2, k3 | VEX | AVX-512F | Bitwise AND of 16-bit masks. |
| kmovq | KMOVQ k1, k2/m64 | VEX | AVX-512BW | Moves 64-bit mask to/from k-register. |
| kmovw | KMOVW k1, k2/m16 | VEX | AVX-512F | Moves 16-bit mask to/from k-register. |
| knotb | KNOTB k1, k2 | VEX | AVX-512DQ | Bitwise NOT of 8-bit mask. |
| knotd | KNOTD k1, k2 | VEX | AVX-512BW | Bitwise NOT of 32-bit mask. |
| knotq | KNOTQ k1, k2 | VEX | AVX-512BW | Bitwise NOT of 64-bit mask. |
| knotw | KNOTW k1, k2 | VEX | AVX-512F | Bitwise NOT of 16-bit mask. |
| korb | KORB k1, k2, k3 | VEX | AVX-512DQ | Bitwise OR of 8-bit masks. |
| kord | KORD k1, k2, k3 | VEX | AVX-512BW | Bitwise OR of 32-bit masks. |
| korq | KORQ k1, k2, k3 | VEX | AVX-512BW | Bitwise OR of 64-bit masks. |
| kortestb | KORTESTB k1, k2 | VEX | AVX-512DQ | ORs 8-bit masks and sets EFLAGS (ZF/CF). |
| kortestq | KORTESTQ k1, k2 | VEX | AVX-512BW | ORs 64-bit masks and sets EFLAGS (ZF/CF). |
| kortestw | KORTESTW k1, k2 | VEX | AVX-512F | ORs 16-bit masks and sets EFLAGS (ZF/CF). |
| korw | KORW k1, k2, k3 | VEX | AVX-512F | Bitwise OR of 16-bit masks. |
| kshiftlb | KSHIFTLB k1, k2, imm8 | VEX | AVX-512DQ | Logically shifts 8-bit mask left. |
| kshiftld | KSHIFTLD k1, k2, imm8 | VEX | AVX-512BW | Logically shifts 32-bit mask left. |
| kshiftlq | KSHIFTLQ k1, k2, imm8 | VEX | AVX-512BW | Logically shifts 64-bit mask left. |
| kshiftlw | KSHIFTLW k1, k2, imm8 | VEX | AVX-512F | Logically shifts 16-bit mask left. |
| kshiftrb | KSHIFTRB k1, k2, imm8 | VEX | AVX-512DQ | Logically shifts 8-bit mask right. |
| kshiftrd | KSHIFTRD k1, k2, imm8 | VEX | AVX-512BW | Logically shifts 32-bit mask right. |
| kshiftrq | KSHIFTRQ k1, k2, imm8 | VEX | AVX-512BW | Logically shifts 64-bit mask right. |
| kshiftrw | KSHIFTRW k1, k2, imm8 | VEX | AVX-512F | Logically shifts 16-bit mask right. |
| ktestb | KTESTB k1, k2 | VEX | AVX-512DQ | ANDs 8-bit masks and sets EFLAGS (ZF/CF). |
| ktestd | KTESTD k1, k2 | VEX | AVX-512BW | ANDs 32-bit masks and sets EFLAGS (ZF/CF). |
| ktestq | KTESTQ k1, k2 | VEX | AVX-512BW | ANDs 64-bit masks and sets EFLAGS (ZF/CF). |
| ktestw | KTESTW k1, k2 | VEX | AVX-512DQ | ANDs 16-bit masks and sets EFLAGS (ZF/CF). |
| kunpckbw | KUNPCKBW k1, k2, k3 | VEX | AVX-512F | Concatenates two 8-bit masks into a 16-bit mask. |
| kunpckdq | KUNPCKDQ k1, k2, k3 | VEX | AVX-512BW | Concatenates two 32-bit masks into a 64-bit mask. |
| kunpckwd | KUNPCKWD k1, k2, k3 | VEX | AVX-512BW | Concatenates two 16-bit masks into a 32-bit mask. |
| kxorb | KXORB k1, k2, k3 | VEX | AVX-512DQ | Bitwise XOR of 8-bit masks. |
| kxord | KXORD k1, k2, k3 | VEX | AVX-512BW | Bitwise XOR of 32-bit masks. |
| kxorq | KXORQ k1, k2, k3 | VEX | AVX-512BW | Bitwise XOR of 64-bit masks. |
| kxorw | KXORW k1, k2, k3 | VEX | AVX-512F | Bitwise XOR of 16-bit masks. |
| lahf | LAHF | Legacy | Base | Loads bits 0, 2, 4, 6, and 7 of EFLAGS into AH. |
| lar | LAR r, r/m16 | System | System | Reads access rights from segment descriptor. |
| lddqu | LDDQU xmm1, m128 | SSE3 | SSE3 | Loads unaligned data avoiding split-line penalties. |
| ldmxcsr | LDMXCSR m32 | SSE | SSE | Loads the MXCSR control/status register from memory. |
| lds | LDS r, m | Legacy | Base (Legacy) | Loads pointer into DS and register. |
| ldtilecfg | LDTILECFG m512 | VEX | AMX-TILE | Loads AMX tile configuration from memory. |
| lea | LEA r, m | Legacy | Base | Computes effective address and stores in register. |
| leave | LEAVE | Legacy | Base | Releases stack frame (MOV ESP, EBP; POP EBP). |
| les | LES r, m | Legacy | Base (Legacy) | Loads pointer into ES and register. |
| lfence | LFENCE | SSE2 | SSE2 | Serializes load operations (Wait for prior loads to complete). |
| lfs | LFS r, m | Legacy | Base | Loads pointer into FS and register. |
| lgdt | LGDT m16&32 | System | System | Loads the GDT register (Privileged). |
| lgs | LGS r, m | Legacy | Base | Loads pointer into GS and register. |
| lidt | LIDT m16&32 | System | System | Loads the IDT register (Privileged). |
| lkgs | LKGS r16 | Legacy | LKGS | Loads the kernel GS base address (FRED support). |
| lldt | LLDT r/m16 | System | System | Loads LDT segment selector (Privileged). |
| lmsw | LMSW r/m16 | System | System | Loads Machine Status Word (Legacy CR0 modification). |
| loadiwkey | LOADIWKEY xmm1, xmm2 | Legacy | KEYLOCKER | Loads the Key Locker internal wrapping key. |
| lods | LODSB | Legacy | Base | Loads byte/word/dword from [ESI] into AL/AX/EAX. |
| lodsd | LODSD | Legacy | Base | Loads doubleword from [ESI] into EAX. |
| lodsq | LODSQ | Legacy | Base (64-bit) | Loads quadword from [RSI] into RAX. |
| lodsw | LODSW | Legacy | Base | Loads word from [ESI] into AX. |
| loop | LOOP rel | Legacy | Base | Decrements ECX/RCX and jumps if not zero. |
| loope | LOOPE rel | Legacy | Base | Decrements count; jumps if count!=0 and ZF=1. |
| loopne | LOOPNE rel | Legacy | Base | Decrements count; jumps if count!=0 and ZF=0. |
| lsl | LSL r, r/m16 | System | System | Reads segment limit from descriptor. |
| lss | LSS r, m | Legacy | Base | Loads pointer into SS and register. |
| ltr | LTR r/m16 | System | System | Loads Task Register (Privileged). |
| lzcnt | LZCNT r, r/m | Legacy | ABM/BMI | Counts the number of leading zero bits. |
| maskmovdqu | MASKMOVDQU xmm, xmm | SSE2 | SSE2 | Non-temporal store of selected bytes (masked). |
| maskmovq | MASKMOVQ mm1, mm2 | SSE | SSE | Non-temporal store of selected MMX bytes. |
| maxps | MAXPS xmm, xmm/m128 | SSE | SSE | Returns maximum of packed floats. |
| maxsd | MAXSD xmm1, xmm2/m64 | SSE2 | SSE2 | Returns the maximum of two low double-precision values. |
| maxss | MAXSS xmm1, xmm2/m32 | SSE | SSE | Returns the maximum of two low single-precision values. |
| mfence | MFENCE | SSE2 | SSE2 | Serializes all load and store operations. |
| minps | MINPS xmm, xmm/m128 | SSE | SSE | Returns minimum of packed floats. |
| minsd | MINSD xmm1, xmm2/m64 | SSE2 | SSE2 | Returns the minimum of two low double-precision values. |
| minss | MINSS xmm1, xmm2/m32 | SSE | SSE | Returns the minimum of two low single-precision values. |
| monitor | MONITOR | Legacy | SSE3 | Sets up a linear address range to be monitored. |
| monitorx | MONITORX | AMD | AMD | Sets up a monitor address (AMD extension). |
| mov | MOV r/m, r | Legacy | Base | Copies data from source to destination. |
| mov cr | MOV CRn, r | System | System | Moves data to/from Control Registers (CR0, CR3, etc.) (Privileged). |
| mov dr | MOV DRn, r | System | System | Moves data to/from Debug Registers (DR0-DR7) (Privileged). |
| movapd | MOVAPD xmm, xmm/m128 | SSE2 | SSE2 | Moves 128-bit packed double data (Must be 16-byte aligned). |
| movaps | MOVAPS xmm, xmm/m128 | SSE | SSE | Moves 128-bit packed float data (Must be 16-byte aligned). |
| movbe | MOVBE r, m | Legacy | MOVBE | Moves data swapping bytes (Big Endian load/store). |
| movd | MOVD mm/xmm, r32/m32 | SSE | MMX/SSE2 | Moves 32 bits between GPR and XMM/MMX register. |
| movddup | MOVDDUP xmm1, xmm2/m64 | SSE3 | SSE3 | Loads 64-bit double and duplicates it to fill 128-bit register. |
| movdir64b | MOVDIR64B r, m512 | Legacy | MOVDIR64B | Moves a 64-byte block as a single direct store with 64-byte write atomicity, avoiding cache pollution. |
| movdiri | MOVDIRI m, r | Legacy | MOVDIRI | Stores 32/64-bit data as a direct store, avoiding cache pollution. |
| movdqa | MOVDQA xmm, xmm/m128 | SSE2 | SSE2 | Moves 128-bit integer data (Aligned). |
| movdqu | MOVDQU xmm, xmm/m128 | SSE2 | SSE2 | Moves 128-bit integer data (Unaligned). |
| movmskpd | MOVMSKPD r32, xmm | SSE2 | SSE2 | Extracts sign bits from two doubles into low 2 bits of register. |
| movmskps | MOVMSKPS r32, xmm | SSE | SSE | Extracts sign bits from four floats into low 4 bits of register. |
| movntdqa | MOVNTDQA xmm1, m128 | SSE4.1 | SSE4.1 | Efficiently loads 128-bits from WC memory (Streaming Load). |
| movnti | MOVNTI m32, r32 | SSE2 | SSE2 | Stores integer register to memory bypassing cache. |
| movntpd | MOVNTPD m128, xmm | SSE2 | SSE2 | Stores double vectors directly to RAM, bypassing cache. |
| movntps | MOVNTPS m128, xmm | SSE | SSE | Stores float vectors directly to RAM, bypassing cache. |
| movntq | MOVNTQ m64, mm | SSE | SSE | Stores 64-bit MMX data bypassing cache. |
| movntsd | MOVNTSD m64, xmm1 | SSE4a | SSE4a | Stores scalar double bypassing cache (AMD SSE4a). |
| movntss | MOVNTSS m32, xmm1 | SSE4a | SSE4a | Stores scalar float bypassing cache (AMD SSE4a). |
| movq | MOVQ mm, mm/m64 | MMX | MMX | Moves 64-bit data between MMX registers/memory. |
| movq | MOVQ xmm, xmm/m64 | SSE2 | SSE2 | Moves 64 bits between XMM registers or memory. |
| movsd | MOVSD xmm1, xmm2/m64 | SSE2 | SSE2 | Moves a single double (low 64 bits) between XMM/Memory. |
| movsd | MOVSD | Legacy | Base | Moves doubleword from [ESI] to [EDI]. |
| movshdup | MOVSHDUP xmm1, xmm2/m128 | SSE3 | SSE3 | Duplicates the high (odd-indexed) 32-bit float of each 64-bit pair. |
| movsldup | MOVSLDUP xmm1, xmm2/m128 | SSE3 | SSE3 | Duplicates the low (even-indexed) 32-bit float of each 64-bit pair. |
| movsq | MOVSQ | Legacy | Base (64-bit) | Moves quadword from [RSI] to [RDI]. |
| movss | MOVSS xmm1, xmm2/m32 | SSE | SSE | Moves a single float (low 32 bits) between XMM/Memory. |
| movsw | MOVSW | Legacy | Base | Moves word from [ESI] to [EDI]. |
| movsx | MOVSX r, r/m | Legacy | Base | Copies and sign-extends a smaller value to a larger register. |
| movsxd | MOVSXD r64, r/m32 | Legacy | Base (64-bit) | Sign-extends a 32-bit register or memory operand to 64 bits. |
| movupd | MOVUPD xmm, xmm/m128 | SSE2 | SSE2 | Moves 128-bit packed double data (Unaligned). |
| movups | MOVUPS xmm, xmm/m128 | SSE | SSE | Moves 128-bit packed float data (Unaligned). |
| movzx | MOVZX r, r/m | Legacy | Base | Copies and zero-extends a smaller value to a larger register. |
| mpsadbw | MPSADBW xmm1, xmm2/m128, imm8 | SSE4.1 | SSE4.1 | Computes multiple SADs of byte blocks. |
| mul | MUL r/m | Legacy | Base | Unsigned multiply (AX = AL * src). |
| mulpd | MULPD xmm, xmm/m128 | SSE2 | SSE2 | Multiplies two 64-bit doubles. |
| mulps | MULPS xmm, xmm/m128 | SSE | SSE | Multiplies four 32-bit floats. |
| mulsd | MULSD xmm1, xmm2/m64 | SSE2 | SSE2 | Multiplies the low double-precision floating-point value. |
| mulss | MULSS xmm1, xmm2/m32 | SSE | SSE | Multiplies the low single-precision floating-point value. |
| mulx | MULX r32, r32, r/m32 | VEX | BMI2 | Unsigned multiply of RDX * Src. Result in Hi:Lo. No flags. |
| mwait | MWAIT | Legacy | SSE3 | Waits for a write to a monitored address. |
| mwaitx | MWAITX | AMD | AMD | Waits for a write to monitored address (AMD extension). |
| neg | NEG r/m | Legacy | Base | Negates value (0 - operand). |
| nop | NOP | Legacy | Base | Does nothing (alias for XCHG EAX, EAX). |
| not | NOT r/m | Legacy | Base | Reverses bits of operand. |
| or | OR r/m, r | Legacy | Base | Performs bitwise OR. |
| orps | ORPS xmm, xmm/m128 | SSE | SSE | Bitwise OR of 128 bits. |
| out | OUT imm8, AL | Legacy | Base | Writes data from AL/AX/EAX to an I/O port. |
| out_var | OUT DX, AL/AX/EAX | Legacy | Base | Writes data to I/O port specified in DX. |
| outs | OUTSB | Legacy | Base | Writes string from memory at [ESI] to I/O port. |
| outsd | OUTSD | Legacy | Base | Writes doubleword from memory at [ESI] to I/O port. |
| outsw | OUTSW | Legacy | Base | Writes word from memory at [ESI] to I/O port. |
| pabsb | PABSB xmm1, xmm2/m128 | SSSE3 | SSSE3 | Computes absolute value of bytes. |
| packssdw | PACKSSDW xmm, xmm/m128 | SSE2 | SSE2 | Converts doublewords to words with saturation. |
| packsswb | PACKSSWB xmm, xmm/m128 | SSE2 | SSE2 | Converts words to bytes with saturation. |
| packusdw | PACKUSDW xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Converts signed dwords to unsigned words with saturation. |
| packuswb | PACKUSWB xmm1, xmm2/m128 | SSE2 | SSE2 | Converts signed words to unsigned bytes with saturation. |
| paddb | PADDB xmm, xmm/m128 | SSE2 | SSE2 | Adds 16 bytes (Wraparound). |
| paddd | PADDD xmm, xmm/m128 | SSE2 | SSE2 | Adds 4 doublewords (Wraparound). |
| paddq | PADDQ xmm, xmm/m128 | SSE2 | SSE2 | Adds 2 quadwords (Wraparound). |
| paddsb | PADDSB mm, mm/m64 | MMX | MMX | Adds 8 signed bytes with saturation (MMX). |
| paddsb | PADDSB xmm, xmm/m128 | SSE2 | SSE2 | Adds 16 signed bytes with saturation. |
| paddsw | PADDSW mm, mm/m64 | MMX | MMX | Adds 4 signed words with saturation (MMX). |
| paddsw | PADDSW xmm1, xmm2/m128 | SSE2 | SSE2 | Adds 16-bit words with signed saturation. |
| paddusb | PADDUSB mm, mm/m64 | MMX | MMX | Adds 8 unsigned bytes with saturation (MMX). |
| paddusb | PADDUSB xmm, xmm/m128 | SSE2 | SSE2 | Adds 16 unsigned bytes with saturation. |
| paddusw | PADDUSW mm, mm/m64 | MMX | MMX | Adds 4 unsigned words with saturation (MMX). |
| paddusw | PADDUSW xmm1, xmm2/m128 | SSE2 | SSE2 | Adds 16-bit words with unsigned saturation. |
| paddw | PADDW xmm, xmm/m128 | SSE2 | SSE2 | Adds 8 words (Wraparound). |
| palignr | PALIGNR xmm1, xmm2/m128, imm8 | SSSE3 | SSSE3 | Concatenates dest and src, extracts 128 bits byte-aligned. |
| pand | PAND mm, mm/m64 | MMX | MMX | Bitwise AND of 64-bit MMX registers. |
| pand | PAND xmm, xmm/m128 | SSE2 | SSE2 | Bitwise AND of 128-bit integers. |
| pandn | PANDN mm, mm/m64 | MMX | MMX | Bitwise AND NOT of 64-bit MMX registers. |
| pause | PAUSE | Legacy | Base | Improves performance of spin-wait loops (alias for REP NOP). |
| pavgb | PAVGB xmm1, xmm2/m128 | SSE2 | SSE2 | Averages packed unsigned bytes (rounded up). |
| pavgw | PAVGW xmm1, xmm2/m128 | SSE2 | SSE2 | Averages packed unsigned words (rounded up). |
| pblendvb | PBLENDVB xmm1, xmm2/m128, \<XMM0\> | SSE4.1 | SSE4.1 | Blends bytes based on variable mask in XMM0. |
| pblendw | PBLENDW xmm1, xmm2/m128, imm8 | SSE4.1 | SSE4.1 | Selects words from two sources based on immediate mask. |
| pclmulqdq | PCLMULQDQ xmm1, xmm2/m128, imm8 | PCLMUL | PCLMULQDQ | Performs carry-less multiplication (Galois Field math for AES-GCM). |
| pcmpeqb | PCMPEQB mm, mm/m64 | MMX | MMX | Compares bytes for equality (MMX). |
| pcmpeqb | PCMPEQB xmm, xmm/m128 | SSE2 | SSE2 | Compares bytes for equality (Result mask 0xFF or 0x00). |
| pcmpeqd | PCMPEQD mm, mm/m64 | MMX | MMX | Compares doublewords for equality (MMX). |
| pcmpeqd | PCMPEQD xmm, xmm/m128 | SSE2 | SSE2 | Compares doublewords for equality. |
| pcmpeqq | PCMPEQQ xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Checks if 64-bit integer elements are equal. |
| pcmpeqw | PCMPEQW mm, mm/m64 | MMX | MMX | Compares words for equality (MMX). |
| pcmpeqw | PCMPEQW xmm, xmm/m128 | SSE2 | SSE2 | Compares words for equality. |
| pcmpestri | PCMPESTRI xmm1, xmm2/m128, imm8 | SSE4.2 | SSE4.2 | Complex string search/compare; returns index in ECX. |
| pcmpestrm | PCMPESTRM xmm1, xmm2/m128, imm8 | SSE4.2 | SSE4.2 | Complex string search/compare; returns mask in XMM0. |
| pcmpgtb | PCMPGTB mm, mm/m64 | MMX | MMX | Compares bytes for greater than (MMX). |
| pcmpgtb | PCMPGTB xmm1, xmm2/m128 | SSE2 | SSE2 | Compares bytes for greater than (signed). |
| pcmpgtd | PCMPGTD mm, mm/m64 | MMX | MMX | Compares doublewords for greater than (MMX). |
| pcmpgtd | PCMPGTD xmm1, xmm2/m128 | SSE2 | SSE2 | Compares doublewords for greater than (signed). |
| pcmpgtq | PCMPGTQ xmm1, xmm2/m128 | SSE4.2 | SSE4.2 | Compares quadwords for greater than (signed). |
| pcmpgtw | PCMPGTW mm, mm/m64 | MMX | MMX | Compares words for greater than (MMX). |
| pcmpgtw | PCMPGTW xmm1, xmm2/m128 | SSE2 | SSE2 | Compares words for greater than (signed). |
| pcmpistri | PCMPISTRI xmm1, xmm2/m128, imm8 | SSE4.2 | SSE4.2 | String search (null-terminated); returns index in ECX. |
| pconfig | PCONFIG | Legacy | PCONFIG | Configures platform features like MKTME (Memory Encryption). |
| pdep | PDEP r32, r32, r/m32 | VEX | BMI2 | Scatters bits from LSB of source to positions marked in mask. |
| pext | PEXT r32, r32, r/m32 | VEX | BMI2 | Extracts bits from source using mask and packs them to LSB. |
| pextrb | PEXTRB r32/m8, xmm1, imm8 | SSE4.1 | SSE4.1 | Extracts a byte from XMM to integer register. |
| pextrd | PEXTRD r32/m32, xmm1, imm8 | SSE4.1 | SSE4.1 | Extracts a doubleword from XMM to register. |
| pextrq | PEXTRQ r64/m64, xmm1, imm8 | SSE4.1 | SSE4.1 | Extracts a quadword from XMM to register. |
| pextrw | PEXTRW r32, xmm1, imm8 | SSE2 | SSE2 | Extracts a word from XMM to integer register. |
| pfadd | PFADD mm, mm/m64 | 3DNow! | 3DNow! | Adds two packed floats (3DNow!). |
| pfmul | PFMUL mm, mm/m64 | 3DNow! | 3DNow! | Multiplies packed floats (3DNow!). |
| pfrcp | PFRCP mm, mm/m64 | 3DNow! | 3DNow! | Approximates reciprocal (3DNow!). |
| pfrsqrt | PFRSQRT mm, mm/m64 | 3DNow! | 3DNow! | Approximates reciprocal sqrt (3DNow!). |
| pfsub | PFSUB mm, mm/m64 | 3DNow! | 3DNow! | Subtracts packed floats (3DNow!). |
| phaddw | PHADDW xmm1, xmm2/m128 | SSSE3 | SSSE3 | Adds adjacent 16-bit integers horizontally. |
| phminposuw | PHMINPOSUW xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Finds minimum word and its index. |
| phsubd | PHSUBD xmm1, xmm2/m128 | SSSE3 | SSSE3 | Subtracts adjacent 32-bit integers horizontally. |
| phsubw | PHSUBW xmm1, xmm2/m128 | SSSE3 | SSSE3 | Subtracts adjacent 16-bit integers horizontally. |
| pinsrb | PINSRB xmm1, r32/m8, imm8 | SSE4.1 | SSE4.1 | Inserts a byte from integer register into XMM. |
| pinsrd | PINSRD xmm1, r32/m32, imm8 | SSE4.1 | SSE4.1 | Inserts a doubleword from register to XMM. |
| pinsrq | PINSRQ xmm1, r64/m64, imm8 | SSE4.1 | SSE4.1 | Inserts a quadword from register to XMM. |
| pinsrw | PINSRW xmm1, r32/m16, imm8 | SSE2 | SSE2 | Inserts a word from integer register into XMM. |
| pmaddubsw | PMADDUBSW xmm1, xmm2/m128 | SSSE3 | SSSE3 | Multiplies signed/unsigned bytes and adds pairs to words. |
| pmaddwd | PMADDWD mm, mm/m64 | MMX | MMX | Multiplies words and adds adjacent pairs (MMX). |
| pmaddwd | PMADDWD xmm1, xmm2/m128 | SSE2 | SSE2 | Multiplies words, adds adjacent pairs to doublewords. |
| pmaxsb | PMAXSB xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Returns maximum of signed bytes. |
| pmaxsd | PMAXSD xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Returns maximum of signed doublewords. |
| pmaxsw | PMAXSW xmm1, xmm2/m128 | SSE2 | SSE2 | Returns maximum of signed words. |
| pmaxub | PMAXUB xmm1, xmm2/m128 | SSE2 | SSE2 | Returns maximum of unsigned bytes. |
| pmaxud | PMAXUD xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Returns maximum of unsigned doublewords. |
| pmaxuw | PMAXUW xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Returns maximum of unsigned words. |
| pminsb | PMINSB xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Returns minimum of signed bytes. |
| pminsd | PMINSD xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Returns minimum of signed doublewords. |
| pminsw | PMINSW xmm1, xmm2/m128 | SSE2 | SSE2 | Returns minimum of signed words. |
| pminub | PMINUB xmm1, xmm2/m128 | SSE2 | SSE2 | Returns minimum of unsigned bytes. |
| pminud | PMINUD xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Returns minimum of unsigned doublewords. |
| pminuw | PMINUW xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Returns minimum of unsigned words. |
| pmovmskb | PMOVMSKB r32, xmm | SSE2 | SSE2 | Creates a mask from the MSB of each byte in XMM. |
| pmovsxbq | PMOVSXBQ xmm1, xmm2/m16 | SSE4.1 | SSE4.1 | Sign extends 8-bit integers to 64-bit. |
| pmovsxbw | PMOVSXBW xmm1, xmm2/m64 | SSE4.1 | SSE4.1 | Sign extends 8-bit integers to 16-bit. |
| pmovsxdq | PMOVSXDQ xmm1, xmm2/m64 | SSE4.1 | SSE4.1 | Sign extends 32-bit integers to 64-bit. |
| pmovsxwd | PMOVSXWD xmm1, xmm2/m64 | SSE4.1 | SSE4.1 | Sign extends 16-bit integers to 32-bit. |
| pmovsxwq | PMOVSXWQ xmm1, xmm2/m32 | SSE4.1 | SSE4.1 | Sign extends 16-bit integers to 64-bit. |
| pmovzxbd | PMOVZXBD xmm1, xmm2/m32 | SSE4.1 | SSE4.1 | Zero extends 8-bit integers to 32-bit. |
| pmovzxbq | PMOVZXBQ xmm1, xmm2/m16 | SSE4.1 | SSE4.1 | Zero extends 8-bit integers to 64-bit. |
| pmovzxbw | PMOVZXBW xmm1, xmm2/m64 | SSE4.1 | SSE4.1 | Zero extends 8-bit integers to 16-bit. |
| pmovzxdq | PMOVZXDQ xmm1, xmm2/m64 | SSE4.1 | SSE4.1 | Zero extends 32-bit integers to 64-bit. |
| pmovzxwd | PMOVZXWD xmm1, xmm2/m64 | SSE4.1 | SSE4.1 | Zero extends 16-bit integers to 32-bit. |
| pmovzxwq | PMOVZXWQ xmm1, xmm2/m32 | SSE4.1 | SSE4.1 | Zero extends 16-bit integers to 64-bit. |
| pmulhrsw | PMULHRSW xmm1, xmm2/m128 | SSSE3 | SSSE3 | Multiplies signed 16-bit words, rounds, and scales. |
| pmulhuw | PMULHUW xmm1, xmm2/m128 | SSE2 | SSE2 | Multiplies unsigned words, keeps high 16 bits. |
| pmulhw | PMULHW mm, mm/m64 | MMX | MMX | Multiplies 4 signed words and stores high 16 bits (MMX). |
| pmulhw | PMULHW xmm1, xmm2/m128 | SSE2 | SSE2 | Multiplies signed words, keeps high 16 bits. |
| pmulld | PMULLD xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Multiplies 32-bit integers, stores low 32-bit result. |
| pmullw | PMULLW mm, mm/m64 | MMX | MMX | Multiplies 4 words and stores low 16 bits (MMX). |
| pmullw | PMULLW xmm1, xmm2/m128 | SSE2 | SSE2 | Multiplies 16-bit words and stores low 16-bit result. |
| pmuludq | PMULUDQ xmm1, xmm2/m128 | SSE2 | SSE2 | Multiplies low 32-bits of each 64-bit chunk to 64-bit result. |
| pop | POP r/m | Legacy | Base | Loads operand from stack and increments SP. |
| popa | POPA | Legacy | Base (32-bit only) | Pops into DI, SI, BP, BX, DX, CX, AX; the stacked SP value is discarded (Invalid in 64-bit). |
| popcnt | POPCNT r, r/m | Legacy | SSE4.2 | Counts number of bits set to 1. |
| popf | POPF | Legacy | Base | Pops stack into EFLAGS. |
| por | POR mm, mm/m64 | MMX | MMX | Bitwise OR of 64-bit MMX registers. |
| por | POR xmm, xmm/m128 | SSE2 | SSE2 | Bitwise OR of 128-bit integers. |
| prefetchnta | PREFETCHNTA m8 | SSE | SSE | Prefetches data to non-temporal cache structure (minimize pollution). |
| prefetcht0 | PREFETCHT0 m8 | SSE | SSE | Prefetches data to L1 cache. |
| prefetcht1 | PREFETCHT1 m8 | SSE | SSE | Hints to fetch data to L2 and L3 caches. |
| prefetcht2 | PREFETCHT2 m8 | SSE | SSE | Hints to fetch data to L3 cache only. |
| prefetchw | PREFETCHW m8 | Legacy | PREFETCHW | Prefetches data with intent to write (RFO). |
| prefetchwt1 | PREFETCHWT1 m8 | Legacy | PREFETCHWT1 | Prefetches data to L2 (T1 hint) with intent to write. |
| psadbw | PSADBW xmm1, xmm2/m128 | SSE2 | SSE2 | Computes absolute differences of bytes and sums them to words. |
| pshufb | PSHUFB xmm1, xmm2/m128 | SSSE3 | SSSE3 | Shuffles bytes according to indices in source operand. |
| pshufd | PSHUFD xmm, xmm/m128, imm8 | SSE2 | SSE2 | Shuffles 32-bit integers. |
| pshufhw | PSHUFHW xmm1, xmm2/m128, imm8 | SSE2 | SSE2 | Shuffles the high 4 words of XMM. |
| pshuflw | PSHUFLW xmm1, xmm2/m128, imm8 | SSE2 | SSE2 | Shuffles the low 4 words of XMM. |
| psignb | PSIGNB xmm1, xmm2/m128 | SSSE3 | SSSE3 | Negates/Zeroes bytes in dest based on sign of src. |
| pslld | PSLLD mm, imm8 | MMX | MMX | Shifts doublewords left (MMX). |
| pslld | PSLLD xmm, imm8 | SSE2 | SSE2 | Shifts doublewords left. |
| pslldq | PSLLDQ xmm1, imm8 | SSE2 | SSE2 | Shifts the entire 128-bit register left by bytes. |
| psllq | PSLLQ mm, imm8 | MMX | MMX | Shifts quadword left (MMX). |
| psllw | PSLLW mm, imm8 | MMX | MMX | Shifts words left (MMX). |
| psllw | PSLLW xmm, imm8 | SSE2 | SSE2 | Shifts words left. |
| psrad | PSRAD mm, imm8 | MMX | MMX | Shifts doublewords right arithmetic (MMX). |
| psrad | PSRAD xmm, imm8 | SSE2 | SSE2 | Shifts doublewords right arithmetic. |
| psraw | PSRAW mm, imm8 | MMX | MMX | Shifts words right arithmetic (MMX). |
| psraw | PSRAW xmm, imm8 | SSE2 | SSE2 | Shifts words right arithmetic (sign bit). |
| psrld | PSRLD mm, imm8 | MMX | MMX | Shifts doublewords right logical (MMX). |
| psrld | PSRLD xmm, imm8 | SSE2 | SSE2 | Shifts doublewords right logical. |
| psrldq | PSRLDQ xmm1, imm8 | SSE2 | SSE2 | Shifts the entire 128-bit register right by bytes. |
| psrlq | PSRLQ mm, imm8 | MMX | MMX | Shifts quadword right logical (MMX). |
| psrlw | PSRLW mm, imm8 | MMX | MMX | Shifts words right logical (MMX). |
| psrlw | PSRLW xmm, imm8 | SSE2 | SSE2 | Shifts words right logical. |
| psubb | PSUBB xmm, xmm/m128 | SSE2 | SSE2 | Subtracts 16 bytes. |
| psubd | PSUBD xmm, xmm/m128 | SSE2 | SSE2 | Subtracts 4 doublewords. |
| psubq | PSUBQ xmm1, xmm2/m128 | SSE2 | SSE2 | Subtracts packed quadwords. |
| psubsb | PSUBSB mm, mm/m64 | MMX | MMX | Subtracts 8 signed bytes with saturation (MMX). |
| psubsw | PSUBSW mm, mm/m64 | MMX | MMX | Subtracts 4 signed words with saturation (MMX). |
| psubsw | PSUBSW xmm1, xmm2/m128 | SSE2 | SSE2 | Subtracts 16-bit words with signed saturation. |
| psubusb | PSUBUSB mm, mm/m64 | MMX | MMX | Subtracts 8 unsigned bytes with saturation (MMX). |
| psubusw | PSUBUSW mm, mm/m64 | MMX | MMX | Subtracts 4 unsigned words with saturation (MMX). |
| psubusw | PSUBUSW xmm1, xmm2/m128 | SSE2 | SSE2 | Subtracts 16-bit words with unsigned saturation. |
| psubw | PSUBW xmm, xmm/m128 | SSE2 | SSE2 | Subtracts 8 words. |
| ptest | PTEST xmm1, xmm2/m128 | SSE4.1 | SSE4.1 | Bitwise compare of 128-bit value (AND) setting flags. |
| ptwrite | PTWRITE r32/r64 | Legacy | PTWRITE | Writes data to the Intel Processor Trace stream. |
| punpckhbw | PUNPCKHBW xmm1, xmm2/m128 | SSE2 | SSE2 | Interleaves high bytes from two sources. |
| punpckhdq | PUNPCKHDQ xmm1, xmm2/m128 | SSE2 | SSE2 | Interleaves high doublewords. |
| punpckhqdq | PUNPCKHQDQ xmm1, xmm2/m128 | SSE2 | SSE2 | Interleaves high quadwords. |
| punpckhwd | PUNPCKHWD xmm1, xmm2/m128 | SSE2 | SSE2 | Interleaves high words. |
| punpcklbw | PUNPCKLBW xmm, xmm/m128 | SSE2 | SSE2 | Interleaves low bytes from two sources. |
| punpckldq | PUNPCKLDQ xmm, xmm/m128 | SSE2 | SSE2 | Interleaves low doublewords. |
| punpcklqdq | PUNPCKLQDQ xmm, xmm/m128 | SSE2 | SSE2 | Interleaves low quadwords. |
| punpcklwd | PUNPCKLWD xmm, xmm/m128 | SSE2 | SSE2 | Interleaves low words. |
| push | PUSH r/m | Legacy | Base | Decrements SP and stores operand on stack. |
| pusha | PUSHA | Legacy | Base (32-bit only) | Pushes AX, CX, DX, BX, SP, BP, SI, DI (Invalid in 64-bit). |
| pushf | PUSHF | Legacy | Base | Pushes EFLAGS onto stack. |
| pxor | PXOR mm, mm/m64 | MMX | MMX | Bitwise XOR of 64-bit MMX registers. |
| pxor | PXOR xmm, xmm/m128 | SSE2 | SSE2 | Bitwise XOR of 128-bit integers. |
| rcl | RCL r/m, imm8 | Legacy | Base | Rotates bits left through Carry Flag. |
| rcpps | RCPPS xmm, xmm/m128 | SSE | SSE | Approximate reciprocal (1/x) of four 32-bit floats. |
| rcpss | RCPSS xmm1, xmm2/m32 | SSE | SSE | Computes approximate reciprocal (1/x) of low float. |
| rcr | RCR r/m, imm8 | Legacy | Base | Rotates bits right through Carry Flag. |
| rdfsbase | RDFSBASE r64 | Legacy | FSGSBASE | Reads the FS base address into a register. |
| rdgsbase | RDGSBASE r64 | Legacy | FSGSBASE | Reads the GS base address into a register. |
| rdmsr | RDMSR | System | System | Reads MSR specified by ECX into EDX:EAX (Privileged). |
| rdpid | RDPID r32 | Legacy | RDPID | Reads the processor ID (TSC_AUX) into register. |
| rdpkru | RDPKRU | Legacy | PKU | Reads PKRU register into EAX (User-mode pages). |
| rdpmc | RDPMC | System | System | Reads performance counter specified by ECX into EDX:EAX. |
| rdrand | RDRAND r32 | Legacy | RDRAND | Retrieves a hardware-generated random number. |
| rdrand | RDRAND r16/r32/r64 | Legacy | RDRAND | Retrieves a hardware-generated random number. |
| rdseed | RDSEED r32 | Legacy | RDSEED | Retrieves a random seed from hardware entropy source. |
| rdseed | RDSEED r16/r32/r64 | Legacy | RDSEED | Retrieves a random seed from hardware entropy source. |
| rdsspq | RDSSPQ r64 | Legacy | CET-SS | Reads the current shadow stack pointer into a register. |
| rdtsc | RDTSC | Legacy | Base | Reads the time-stamp counter into EDX:EAX. |
| rdtscp | RDTSCP | Legacy | RDTSCP | Reads TSC into EDX:EAX and IA32_TSC_AUX (processor ID) into ECX. |
| rep movs | REP MOVS m, m | Legacy | Base | Moves ECX bytes/words from [ESI] to [EDI]. |
| rep stos | REP STOS m | Legacy | Base | Fills [EDI] with AL/AX/EAX for ECX repeats. |
| repe cmps | REPE CMPS m, m | Legacy | Base | Compares [ESI] and [EDI] until mismatch or ECX=0. |
| repne scas | REPNE SCAS m | Legacy | Base | Scans [EDI] for AL/AX/EAX until match or ECX=0. |
| ret | RET | Legacy | Base | Pop EIP/RIP and resume execution. |
| rol | ROL r/m, imm8 | Legacy | Base | Rotates bits left. |
| ror | ROR r/m, imm8 | Legacy | Base | Rotates bits right. |
| rorx | RORX r32, r/m32, imm8 | VEX | BMI2 | Rotate right with immediate. No flags update. |
| roundpd | ROUNDPD xmm1, xmm2/m128, imm8 | SSE4.1 | SSE4.1 | Rounds all packed doubles according to immediate mode. |
| roundps | ROUNDPS xmm1, xmm2/m128, imm8 | SSE4.1 | SSE4.1 | Rounds all packed floats according to immediate mode. |
| roundsd | ROUNDSD xmm1, xmm2/m64, imm8 | SSE4.1 | SSE4.1 | Rounds low double according to immediate mode. |
| roundss | ROUNDSS xmm1, xmm2/m32, imm8 | SSE4.1 | SSE4.1 | Rounds low float according to immediate mode. |
| rsm | RSM | System | System (SMM) | Exits SMM and returns to previous state (Privileged). |
| rsqrtps | RSQRTPS xmm, xmm/m128 | SSE | SSE | Approximate reciprocal sqrt (1/sqrt(x)) of four 32-bit floats. |
| rsqrtss | RSQRTSS xmm1, xmm2/m32 | SSE | SSE | Computes approximate reciprocal sqrt (1/sqrt(x)) of low float. |
| rstorssp | RSTORSSP m64 | Legacy | CET-SS | Restores SSP from memory token. |
| sahf | SAHF | Legacy | Base | Loads SF, ZF, AF, PF, and CF from AH. |
| sal | SAL r/m, imm8 | Legacy | Base | Shifts bits left (Alias for SHL). |
| sar | SAR r/m, imm8 | Legacy | Base | Shifts bits right, preserving sign bit. |
| sarx | SARX r32, r/m32, r32 | VEX | BMI2 | Arithmetic right shift, count in register. No flags update. |
| saveprevssp | SAVEPREVSSP | Legacy | CET-SS | Saves the previous SSP to the shadow stack token. |
| sbb | SBB r/m, r | Legacy | Base | Subtracts operands and the Carry Flag (CF). |
| scas | SCASB | Legacy | Base | Compares AL/AX/EAX with memory at [EDI]. |
| scasd | SCASD | Legacy | Base | Compares EAX with memory at [EDI]. |
| scasq | SCASQ | Legacy | Base (64-bit) | Compares RAX with memory at [RDI]. |
| scasw | SCASW | Legacy | Base | Compares AX with memory at [EDI]. |
| senduipi | SENDUIPI r64 | Legacy | UINTR | Sends a User IPI to another processor. |
| serialize | SERIALIZE | Legacy | SERIALIZE | Forces serialization of instruction fetch/execution. |
| seta | SETA r/m8 | Legacy | Base | Sets byte to 1 if CF=0 and ZF=0. |
| setae | SETAE r/m8 | Legacy | Base | Sets byte to 1 if CF=0. |
| setb | SETB r/m8 | Legacy | Base | Sets byte to 1 if CF=1. |
| setbe | SETBE r/m8 | Legacy | Base | Sets byte to 1 if CF=1 or ZF=1. |
| setcc | SETcc r/m8 | Legacy | Base | Sets byte to 1 if condition met, else 0 (e.g., SETE, SETZ). |
| setg | SETG r/m8 | Legacy | Base | Sets byte to 1 if ZF=0 and SF=OF. |
| setge | SETGE r/m8 | Legacy | Base | Sets byte to 1 if SF=OF. |
| setl | SETL r/m8 | Legacy | Base | Sets byte to 1 if SF!=OF. |
| setle | SETLE r/m8 | Legacy | Base | Sets byte to 1 if ZF=1 or SF!=OF. |
| setno | SETNO r/m8 | Legacy | Base | Sets byte to 1 if OF=0. |
| setnp | SETNP r/m8 | Legacy | Base | Sets byte to 1 if PF=0 (Odd Parity). |
| setns | SETNS r/m8 | Legacy | Base | Sets byte to 1 if SF=0 (Non-negative). |
| setnz | SETNZ r/m8 | Legacy | Base | Sets byte to 1 if ZF=0. |
| seto | SETO r/m8 | Legacy | Base | Sets byte to 1 if OF=1. |
| setp | SETP r/m8 | Legacy | Base | Sets byte to 1 if PF=1 (Even Parity). |
| sets | SETS r/m8 | Legacy | Base | Sets byte to 1 if SF=1 (Negative). |
| setz | SETZ r/m8 | Legacy | Base | Sets byte to 1 if ZF=1. |
| sfence | SFENCE | SSE | SSE | Serializes store operations (Wait for prior stores to complete). |
| sgdt | SGDT m | System | System | Stores GDT limit and base address to memory. |
| sha1msg1 | SHA1MSG1 xmm1, xmm2/m128 | Legacy | SHA | Performs intermediate calculation for SHA1 message schedule. |
| sha1msg2 | SHA1MSG2 xmm1, xmm2/m128 | Legacy | SHA | Performs final calculation for SHA1 message schedule. |
| sha1nexte | SHA1NEXTE xmm1, xmm2/m128 | Legacy | SHA | Calculates SHA1 state variable E. |
| sha1rnds4 | SHA1RNDS4 xmm1, xmm2/m128, imm8 | Legacy | SHA | Performs 4 rounds of SHA1 operation. |
| sha256msg1 | SHA256MSG1 xmm1, xmm2/m128 | Legacy | SHA | Performs intermediate calculation for SHA256 message schedule. |
| sha256msg2 | SHA256MSG2 xmm1, xmm2/m128 | Legacy | SHA | Performs final calculation for SHA256 message schedule. |
| sha256rnds2 | SHA256RNDS2 xmm1, xmm2/m128, xmm0 | Legacy | SHA | Performs 2 rounds of SHA256 operation. |
| shl | SHL r/m, imm8 | Legacy | Base | Shifts bits left (same as SAL). |
| shld | SHLD r/m, r, imm8 | Legacy | Base | Shifts dest left, filling with bits from src. |
| shlx | SHLX r32, r/m32, r32 | VEX | BMI2 | Logical left shift, count in register. No flags update. |
| shr | SHR r/m, imm8 | Legacy | Base | Shifts bits right, filling with zeros. |
| shrd | SHRD r/m, r, imm8 | Legacy | Base | Shifts dest right, filling with bits from src. |
| shrx | SHRX r32, r/m32, r32 | VEX | BMI2 | Logical right shift, count in register. No flags update. |
| shufpd | SHUFPD xmm1, xmm2/m128, imm8 | SSE2 | SSE2 | Shuffles 64-bit doubles between two XMM registers. |
| shufpd | SHUFPD xmm, xmm/m128, imm8 | SSE2 | SSE2 | Shuffles 64-bit doubles based on immediate mask. |
| shufps | SHUFPS xmm1, xmm2/m128, imm8 | SSE | SSE | Shuffles 32-bit floats between two XMM registers. |
| shufps | SHUFPS xmm, xmm/m128, imm8 | SSE | SSE | Shuffles 32-bit floats based on immediate mask. |
| sidt | SIDT m | System | System | Stores IDT limit and base address to memory. |
| sldt | SLDT r/m16 | System | System | Stores LDT segment selector. |
| smsw | SMSW r/m16 | System | System | Stores Machine Status Word. |
| sqrtpd | SQRTPD xmm, xmm/m128 | SSE2 | SSE2 | Computes square root of two 64-bit doubles. |
| sqrtps | SQRTPS xmm, xmm/m128 | SSE | SSE | Computes square root of four 32-bit floats. |
| sqrtsd | SQRTSD xmm1, xmm2/m64 | SSE2 | SSE2 | Computes square root of the low double. |
| sqrtss | SQRTSS xmm1, xmm2/m32 | SSE | SSE | Computes square root of the low float. |
| stac | STAC | Legacy | SMAP | Sets the AC flag in EFLAGS (permits supervisor access to user-mode pages under SMAP). |
| stc | STC | Legacy | Base | Sets the CF flag to 1. |
| std | STD | Legacy | Base | Sets DF to 1 (String operations decrement). |
| stgi | STGI | SVM | SVM | Enables global interrupts (AMD SVM). |
| sti | STI | Legacy | Base | Enables maskable hardware interrupts. |
| stmxcsr | STMXCSR m32 | SSE | SSE | Stores the MXCSR register to memory. |
| stos | STOSB | Legacy | Base | Stores AL/AX/EAX to memory at [EDI]. |
| stosd | STOSD | Legacy | Base | Stores EAX to memory at [EDI]. |
| stosq | STOSQ | Legacy | Base (64-bit) | Stores RAX to memory at [RDI]. |
| stosw | STOSW | Legacy | Base | Stores AX to memory at [EDI]. |
| str | STR r/m16 | System | System | Stores Task Register. |
| sttilecfg | STTILECFG m512 | VEX | AMX-TILE | Stores AMX tile configuration to memory. |
| stui | STUI | Legacy | UINTR | Sets the User Interrupt Flag (UIF). |
| sub | SUB r/m, r | Legacy | Base | Subtracts source from destination. |
| sub | SUB r/m, r | Legacy | Base | Subtracts src from dest. |
| subpd | SUBPD xmm, xmm/m128 | SSE2 | SSE2 | Subtracts two 64-bit doubles. |
| subps | SUBPS xmm, xmm/m128 | SSE | SSE | Subtracts four 32-bit floats. |
| subsd | SUBSD xmm1, xmm2/m64 | SSE2 | SSE2 | Subtracts the low double-precision floating-point value. |
| subss | SUBSS xmm1, xmm2/m32 | SSE | SSE | Subtracts the low single-precision floating-point value. |
| swapgs | SWAPGS | Legacy | Base (64-bit System) | Swaps user/kernel GS base address (System). |
| syscall | SYSCALL | System | System (64-bit) | Fast call to privilege level 0 system procedures. |
| sysenter | SYSENTER | System | System | Fast call to level 0 system procedures. |
| sysexit | SYSEXIT | System | System | Fast return to level 3 user code. |
| sysret | SYSRET | System | System (64-bit) | Fast return to privilege level 3 user code. |
| t1mskc | T1MSKC r32, r/m32 | TBM | TBM | Creates inverse mask from trailing ones (~x \| (x+1)). |
| tdpbf16ps | TDPBF16PS tmm1, tmm2, tmm3 | VEX | AMX-BF16 | Matrix multiply (BFloat16) accumulating to Float32. |
| tdpbssd | TDPBSSD tmm1, tmm2, tmm3 | VEX | AMX-INT8 | Matrix multiply (Signed Int8 * Signed Int8) accumulating to Int32. |
| tdpbsud | TDPBSUD tmm1, tmm2, tmm3 | VEX | AMX-INT8 | Matrix multiply (Signed * Unsigned) accumulating to Int32. |
| tdpbusd | TDPBUSD tmm1, tmm2, tmm3 | VEX | AMX-INT8 | Matrix multiply (Unsigned * Signed) accumulating to Int32. |
| tdpbuud | TDPBUUD tmm1, tmm2, tmm3 | VEX | AMX-INT8 | Matrix multiply (Unsigned * Unsigned) accumulating to Int32. |
| tdpfp16ps | TDPFP16PS tmm1, tmm2, tmm3 | VEX | AMX-FP16 | Matrix multiply (FP16 * FP16) accumulating to Float32. |
| test | TEST r/m, r | Legacy | Base | ANDs operands and updates flags (result discarded). |
| testui | TESTUI | Legacy | UINTR | Copies the User Interrupt Flag (UIF) into CF; clears ZF, AF, OF, PF, and SF. |
| tileloadd | TILELOADD tmm1, m | VEX | AMX-TILE | Loads data into an AMX tile register. |
| tileloaddt1 | TILELOADDT1 tmm1, m | VEX | AMX-TILE | Loads data into an AMX tile register with T1 hint. |
| tilestored | TILESTORED m, tmm1 | VEX | AMX-TILE | Stores data from an AMX tile register to memory. |
| tilezero | TILEZERO tmm1 | VEX | AMX-TILE | Clears an AMX tile register. |
| tpause | TPAUSE r32 | Legacy | WAITPKG | Pauses execution for a specified time or until trigger. |
| tsxldtrk | TSXLDTRK | Legacy | TSXLDTRK | Suspends/resumes tracking of load addresses in TSX (instructions XSUSLDTRK and XRESLDTRK). |
| tzcnt | TZCNT r32, r/m32 | Legacy | BMI1 | Counts the number of trailing zeros. |
| tzmsk | TZMSK r32, r/m32 | TBM | TBM | Creates mask from trailing zeros (~x & (x-1)). |
| ucomisd | UCOMISD xmm1, xmm2/m64 | SSE2 | SSE2 | Compares low double and sets EFLAGS. |
| ucomisd | UCOMISD xmm, xmm/m64 | SSE2 | SSE2 | Compares low double and sets EFLAGS. |
| ucomiss | UCOMISS xmm1, xmm2/m32 | SSE | SSE | Compares low float and sets EFLAGS. |
| ucomiss | UCOMISS xmm, xmm/m32 | SSE | SSE | Compares low float and sets EFLAGS. |
| ud0 | UD0 | Legacy | Base | Generates invalid opcode exception. |
| ud2 | UD2 | Legacy | Base | Generates an invalid opcode exception. |
| uiret | UIRET | Legacy | UINTR | Returns from a User Interrupt handler. |
| umonitor | UMONITOR r64 | Legacy | WAITPKG | Sets up a monitor address for User Wait instructions. |
| umwait | UMWAIT r32 | Legacy | WAITPKG | Waits for store to monitored address (Low power state). |
| unpckhpd | UNPCKHPD xmm1, xmm2/m128 | SSE2 | SSE2 | Interleaves high doubles from two sources. |
| unpckhps | UNPCKHPS xmm1, xmm2/m128 | SSE | SSE | Interleaves high floats from two registers. |
| unpcklpd | UNPCKLPD xmm1, xmm2/m128 | SSE2 | SSE2 | Interleaves low doubles from two sources. |
| unpcklps | UNPCKLPS xmm1, xmm2/m128 | SSE | SSE | Interleaves low floats from two registers. |
| v4fmaddps | V4FMADDPS zmm1 {k1}, zmm2+3, m128 | EVEX | AVX-512-4FMAPS | 4-way FMA for Neural Nets (Single). |
| v4fmaddss | V4FMADDSS xmm1 {k1}, xmm2+3, m128 | EVEX | AVX-512-4FMAPS | 4-way FMA for Neural Nets (Scalar). |
| v4fnmaddps | V4FNMADDPS zmm1 {k1}, zmm2+3, m128 | EVEX | AVX-512-4FMAPS | 4-way Negative FMA for Neural Nets (Single). |
| v4fnmaddss | V4FNMADDSS xmm1 {k1}, xmm2+3, m128 | EVEX | AVX-512-4FMAPS | 4-way Negative FMA for Neural Nets (Scalar). |
| vaddph | VADDPH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Adds half-precision floating-point values. |
| vaddps | VADDPS ymm1, ymm2, ymm3/m256 | VEX | AVX | Adds packed floats (256-bit YMM support). |
| vaddsh | VADDSH xmm1 {k1}, xmm2, xmm3/m16 | EVEX | AVX-512-FP16 | Adds low FP16 value. |
| vaddss | VADDSS xmm1 {k1}, xmm2, xmm3/m32 | EVEX | AVX-512F | Adds scalar single precision (EVEX encoded with masking). |
| vaesdec | VAESDEC zmm1, zmm2, zmm3/m512 | EVEX | AVX-512-VAES | AES Decrypt on 512-bit vector. |
| vaesenc | VAESENC zmm1, zmm2, zmm3/m512 | EVEX | AVX-512-VAES | AES Encrypt on 512-bit vector. |
| valignd | VALIGND zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Concatenates two ZMMs, shifts right by imm8 doublewords, and extracts the low 512 bits. |
| valignq | VALIGNQ zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Concatenates two ZMMs, shifts right by imm8 quadwords, and extracts the low 512 bits. |
| vbroadcastf128 | VBROADCASTF128 ymm1, m128 | VEX | AVX | Broadcasts 128-bit FP block to YMM. |
| vbroadcasti128 | VBROADCASTI128 ymm1, m128 | VEX | AVX2 | Broadcasts 128-bit integer block to YMM. |
| vbroadcastsd | VBROADCASTSD ymm1, m64 | VEX | AVX | Broadcasts a double to all elements of YMM. |
| vbroadcastss | VBROADCASTSS ymm1, m32 | VEX | AVX | Loads one float and replicates it to all YMM elements. |
| vcmppd | VCMPPD ymm1, ymm2, ymm3/m256, imm8 | VEX | AVX | Compares packed doubles (AVX version with immediate). |
| vcmpps | VCMPPS ymm1, ymm2, ymm3/m256, imm8 | VEX | AVX | Compares packed floats (AVX version with immediate). |
| vcompresspd | VCOMPRESSPD m512 {k1}, zmm1 | EVEX | AVX-512F | Compresses active elements from ZMM to memory. |
| vcompressps | VCOMPRESSPS m512 {k1}, zmm1 | EVEX | AVX-512F | Compresses active elements from ZMM to memory. |
| vcvtdq2ps | VCVTDQ2PS ymm1, ymm2/m256 | VEX | AVX | Converts eight 32-bit integers to floats. |
| vcvtne2ps2bf16 | VCVTNE2PS2BF16 zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-BF16 | Converts two float vectors to one BFloat16 vector. |
| vcvtpd2udq | VCVTPD2UDQ ymm1 {k1}, zmm2/m512 | EVEX | AVX-512F | Converts 64-bit doubles to unsigned 32-bit integers. |
| vcvtpd2uqq | VCVTPD2UQQ zmm1 {k1}, zmm2/m512 | EVEX | AVX-512DQ | Converts 64-bit doubles to unsigned 64-bit integers. |
| vcvtph2ps | VCVTPH2PS xmm1, xmm2/m64 | VEX | F16C | Converts half-precision floats to single-precision. |
| vcvtps2dq | VCVTPS2DQ ymm1, ymm2/m256 | VEX | AVX | Converts eight floats to 32-bit integers (Rounded). |
| vcvtps2ph | VCVTPS2PH xmm1/m64, xmm2, imm8 | VEX | F16C | Converts single-precision floats to half-precision. |
| vcvtps2udq | VCVTPS2UDQ zmm1 {k1}, zmm2/m512 | EVEX | AVX-512F | Converts 32-bit floats to unsigned 32-bit integers. |
| vcvtps2uqq | VCVTPS2UQQ zmm1 {k1}, ymm2/m256 | EVEX | AVX-512DQ | Converts 32-bit floats to unsigned 64-bit integers. |
| vcvttps2dq | VCVTTPS2DQ ymm1, ymm2/m256 | VEX | AVX | Converts eight floats to 32-bit integers (Truncated). |
| vcvtudq2pd | VCVTUDQ2PD zmm1 {k1}, ymm2/m256 | EVEX | AVX-512F | Converts unsigned 32-bit integers to 64-bit doubles. |
| vcvtudq2ps | VCVTUDQ2PS zmm1 {k1}, zmm2/m512 | EVEX | AVX-512F | Converts unsigned int32 to float. |
| vcvtuqq2pd | VCVTUQQ2PD zmm1 {k1}, zmm2/m512 | EVEX | AVX-512DQ | Converts unsigned 64-bit integers to 64-bit doubles. |
| vcvtuqq2ps | VCVTUQQ2PS ymm1 {k1}, zmm2/m512 | EVEX | AVX-512DQ | Converts unsigned 64-bit integers to 32-bit floats. |
| vdbpsadbw | VDBPSADBW zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512BW | Computes double-block sums of absolute differences on unsigned bytes, producing 16-bit results. |
| vdivph | VDIVPH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Divides half-precision floating-point values. |
| vdivsh | VDIVSH xmm1 {k1}, xmm2, xmm3/m16 | EVEX | AVX-512-FP16 | Divides low FP16 value. |
| vdpbf16ps | VDPBF16PS zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-BF16 | BFloat16 dot product accumulating to Float32. |
| verr | VERR r/m16 | System | System | Checks if segment can be read; sets ZF. |
| verw | VERW r/m16 | System | System | Checks if segment can be written; sets ZF. |
| vexpandpd | VEXPANDPD zmm1 {k1}, m512 | EVEX | AVX-512F | Expands data from memory into sparse locations in ZMM. |
| vexpandps | VEXPANDPS zmm1 {k1}, m512 | EVEX | AVX-512F | Expands data from memory into sparse locations in ZMM. |
| vextractf128 | VEXTRACTF128 xmm1/m128, ymm2, imm8 | VEX | AVX | Extracts 128 bits from a YMM register. |
| vextracti128 | VEXTRACTI128 xmm1/m128, ymm2, imm8 | VEX | AVX2 | Extracts 128 bits of integer data from a YMM register. |
| vfcmaddcph | VFCMADDCPH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Complex conjugate multiply-add for half-precision. |
| vfixupimmpd | VFIXUPIMMPD zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Fixes special cases (NaN, Inf) using a table. |
| vfixupimmps | VFIXUPIMMPS zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Fixes special cases (NaN, Inf) using a table (Float32). |
| vfixupimmss | VFIXUPIMMSS xmm1 {k1}, xmm2, xmm3/m32, imm8 | EVEX | AVX-512F | Fixes special cases (NaN, Inf) in low float using table. |
| vfmadd132ph | VFMADD132PH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Computes (Dest * Src2) + Src1 in half-precision. |
| vfmadd132ps | VFMADD132PS ymm1, ymm2, ymm3/m256 | VEX | FMA3 | Computes (Dest * Src2) + Src1. |
| vfmadd132sh | VFMADD132SH xmm1 {k1}, xmm2, xmm3/m16 | EVEX | AVX-512-FP16 | Scalar FMA (Dest * Src2 + Src1) for FP16. |
| vfmadd132ss | VFMADD132SS xmm1, xmm2, xmm3/m32 | VEX | FMA3 | Scalar FMA: Dest = (Dest * Src2) + Src1. |
| vfmadd213ph | VFMADD213PH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Computes (Src1 * Dest) + Src2 in half-precision. |
| vfmadd213ps | VFMADD213PS ymm1, ymm2, ymm3/m256 | VEX | FMA3 | Computes (Src1 * Dest) + Src2. |
| vfmadd213ss | VFMADD213SS xmm1, xmm2, xmm3/m32 | VEX | FMA3 | Scalar FMA: Dest = (Src1 * Dest) + Src2. |
| vfmadd231ph | VFMADD231PH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Computes (Src1 * Src2) + Dest in half-precision. |
| vfmadd231ps | VFMADD231PS ymm1, ymm2, ymm3/m256 | VEX | FMA3 | Computes (Src1 * Src2) + Dest. |
| vfmadd231ss | VFMADD231SS xmm1, xmm2, xmm3/m32 | VEX | FMA3 | Scalar FMA: Dest = (Src1 * Src2) + Dest. |
| vfmaddcph | VFMADDCPH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Complex multiply-add for half-precision. |
| vfmaddcsh | VFMADDCSH xmm1 {k1}, xmm2, xmm3/m32 | EVEX | AVX-512-FP16 | Complex multiply-add for scalar half-precision. |
| vfmsub132ps | VFMSUB132PS ymm1, ymm2, ymm3/m256 | VEX | FMA3 | Computes (Dest * Src2) - Src1. |
| vfnmadd132ps | VFNMADD132PS ymm1, ymm2, ymm3/m256 | VEX | FMA3 | Computes -(Dest * Src2) + Src1. |
| vfpclasspd | VFPCLASSPD k1 {k2}, zmm2/m512, imm8 | EVEX | AVX-512DQ | Tests for category (NaN, Inf, Denormal) for doubles. |
| vfpclassps | VFPCLASSPS k1 {k2}, zmm2/m512, imm8 | EVEX | AVX-512DQ | Tests for category (NaN, Inf, Denormal) for floats. |
| vgatherdpd | VGATHERDPD ymm1, [base+xmm_idx*scale], ymm_mask | VEX | AVX2 | Loads doubles from non-contiguous memory using indices. |
| vgatherdps | VGATHERDPS ymm1, [base+ymm_idx*scale], ymm_mask | VEX | AVX2 | Loads floats from non-contiguous memory using indices. |
| vgatherpf0dpd | VGATHERPF0DPD {k1}, [base+ymm_idx] | EVEX | AVX-512PF | Prefetches doubles to L1 cache using indices. |
| vgatherpf0dps | VGATHERPF0DPS {k1}, [base+zmm_idx] | EVEX | AVX-512PF | Prefetches floats to L1 cache using indices. |
| vgatherpf0qpd | VGATHERPF0QPD {k1}, [base+zmm_idx] | EVEX | AVX-512PF | Prefetches doubles to L1 using 64-bit indices. |
| vgatherpf0qps | VGATHERPF0QPS {k1}, [base+zmm_idx] | EVEX | AVX-512PF | Prefetches floats to L1 using 64-bit indices. |
| vgetexppd | VGETEXPPD zmm1 {k1}, zmm2/m512 | EVEX | AVX-512F | Extracts exponents from doubles as double-precision floating-point values. |
| vgetexpss | VGETEXPSS xmm1 {k1}, xmm2, xmm3/m32 | EVEX | AVX-512F | Extracts exponent from low float. |
| vgetmantpd | VGETMANTPD zmm1 {k1}, zmm2/m512, imm8 | EVEX | AVX-512F | Extracts mantissas from doubles. |
| vgetmantsd | VGETMANTSD xmm1 {k1}, xmm2, xmm3/m64, imm8 | EVEX | AVX-512F | Extracts mantissa from low double. |
| vinsertf128 | VINSERTF128 ymm1, ymm2, xmm3/m128, imm8 | VEX | AVX | Inserts 128 bits into a YMM register. |
| vinserti128 | VINSERTI128 ymm1, ymm2, xmm3/m128, imm8 | VEX | AVX2 | Inserts 128 bits of integer data into a YMM register. |
| vmaskmovpd | VMASKMOVPD ymm1, ymm2, m256 | VEX | AVX | Conditionally loads/stores doubles based on mask. |
| vmaskmovps | VMASKMOVPS ymm1, ymm2, m256 | VEX | AVX | Conditionally loads/stores floats based on mask. |
| vmaxph | VMAXPH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Maximum of half-precision values. |
| vmcall | VMCALL | VMX | VMX | Guest VM calls the Hypervisor (VM Exit). |
| vmclear | VMCLEAR m64 | VMX | VMX | Initializes a VMCS region in memory. |
| vmfunc | VMFUNC | VMX | VMX | Invoke VM function specified in EAX. |
| vminph | VMINPH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Minimum of half-precision values. |
| vmlaunch | VMLAUNCH | VMX | VMX | Launches a VM managed by the current VMCS. |
| vmload | VMLOAD | SVM | SVM | Loads processor state from VMCB (AMD SVM). |
| vmxoff | VMXOFF | VMX | VMX | Leaves VMX operation. |
| vmptrld | VMPTRLD m64 | VMX | VMX | Loads the current VMCS pointer from memory. |
| vmptrst | VMPTRST m64 | VMX | VMX | Stores the current VMCS pointer to memory. |
| vmread | VMREAD r/m64, r64 | VMX | VMX | Reads a field from the Virtual Machine Control Structure. |
| vmresume | VMRESUME | VMX | VMX | Resumes a VM from the current VMCS. |
| vmrun | VMRUN | SVM | SVM | Switch to guest VM (AMD SVM). |
| vmsave | VMSAVE | SVM | SVM | Saves processor state to VMCB (AMD SVM). |
| vmulph | VMULPH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Multiplies half-precision floating-point values. |
| vmulps | VMULPS ymm1, ymm2, ymm3/m256 | VEX | AVX | Multiplies packed floats (256-bit). |
| vmulsh | VMULSH xmm1 {k1}, xmm2, xmm3/m16 | EVEX | AVX-512-FP16 | Multiplies low FP16 value. |
| vmulss | VMULSS xmm1 {k1}, xmm2, xmm3/m32 | EVEX | AVX-512F | Multiplies scalar single precision (EVEX encoded with masking). |
| vmwrite | VMWRITE r64, r/m64 | VMX | VMX | Writes a field to the Virtual Machine Control Structure. |
| vmxon | VMXON m64 | VMX | VMX | Enters VMX root operation (Host Mode). |
| vp2intersectd | VP2INTERSECTD k1+1, zmm2, zmm3/m512 | EVEX | AVX-512-VP2INTERSECT | Computes intersection of two ZMM registers into mask pair. |
| vp2intersectq | VP2INTERSECTQ k1+1, zmm2, zmm3/m512 | EVEX | AVX-512-VP2INTERSECT | Computes intersection of two ZMM registers into mask pair. |
| vp4dpwssd | VP4DPWSSD zmm1 {k1}, zmm2+3, m128 | EVEX | AVX-512-4VNNIW | Neural Net 4-way dot product. |
| vp4dpwssds | VP4DPWSSDS zmm1 {k1}, zmm2+3, m128 | EVEX | AVX-512-4VNNIW | Neural Net 4-way dot product with saturation. |
| vpabsd | VPABSD zmm1 {k1}, zmm2/m512 | EVEX | AVX-512F | Computes absolute value of 32-bit integers. |
| vpabsq | VPABSQ zmm1 {k1}, zmm2/m512 | EVEX | AVX-512F | Computes absolute value of 64-bit integers. |
| vpaddb | VPADDB ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Adds 32 bytes (256-bit). |
| vpaddd | VPADDD ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Adds eight 32-bit integers (256-bit). |
| vpbroadcastb | VPBROADCASTB ymm1, xmm2/m8 | VEX | AVX2 | Broadcasts a byte from memory/register to all elements of YMM. |
| vpbroadcastd | VPBROADCASTD ymm1, xmm2/m32 | VEX | AVX2 | Loads one integer and replicates it to all YMM elements. |
| vpbroadcastq | VPBROADCASTQ ymm1, xmm2/m64 | VEX | AVX2 | Broadcasts a quadword to all elements of YMM. |
| vpbroadcastw | VPBROADCASTW ymm1, xmm2/m16 | VEX | AVX2 | Broadcasts a word to all elements of YMM. |
| vpclmulqdq | VPCLMULQDQ zmm1, zmm2, zmm3/m512, imm8 | EVEX | AVX-512-VPCLMULQDQ | Carry-less multiply on 512-bit vector. |
| vpcmov | VPCMOV xmm1, xmm2, xmm3, xmm4 | XOP | XOP | Bitwise conditional move based on selector. |
| vpcmpb | VPCMPB k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512BW | Compares bytes and stores result in k-register mask. |
| vpcmpd | VPCMPD k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Compares doublewords and stores result in k-register mask. |
| vpcmpq | VPCMPQ k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Compares quadwords and stores result in k-register mask. |
| vpcmpub | VPCMPUB k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512BW | Compares unsigned bytes and stores result in k-register. |
| vpcmpud | VPCMPUD k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Compares unsigned doublewords and stores result in k-register. |
| vpcmpuq | VPCMPUQ k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Compares unsigned quadwords and stores result in k-register. |
| vpcmpuw | VPCMPUW k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512BW | Compares unsigned words and stores result in k-register. |
| vpcmpw | VPCMPW k1 {k2}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512BW | Compares words and stores result in k-register mask. |
| vpcomb | VPCOMB xmm1, xmm2, xmm3/m128, imm8 | XOP | XOP | Compares bytes using immediate condition. |
| vpcompressb | VPCOMPRESSB m512 {k1}, zmm1 | EVEX | AVX-512-VBMI2 | Compresses active bytes from ZMM to memory. |
| vpcompressw | VPCOMPRESSW m512 {k1}, zmm1 | EVEX | AVX-512-VBMI2 | Compresses active words from ZMM to memory. |
| vpconflictd | VPCONFLICTD zmm1 {k1}, zmm2/m512 | EVEX | AVX-512CD | Detects duplicate values in a vector (Conflict Detection). |
| vpconflictq | VPCONFLICTQ zmm1 {k1}, zmm2/m512 | EVEX | AVX-512CD | Detects duplicate values in a quadword vector. |
| vpdpbusd | VPDPBUSD zmm1, zmm2, zmm3/m512 | EVEX | AVX-512-VNNI | Dot product of unsigned/signed bytes, accum to dword. |
| vpdpbusds | VPDPBUSDS zmm1, zmm2, zmm3/m512 | EVEX | AVX-512-VNNI | Dot product of unsigned/signed bytes, accum to dword (Saturate). |
| vpdpwssd | VPDPWSSD zmm1, zmm2, zmm3/m512 | EVEX | AVX-512-VNNI | Dot product of signed words, accum to dword. |
| vpdpwssds | VPDPWSSDS zmm1, zmm2, zmm3/m512 | EVEX | AVX-512-VNNI | Dot product of signed words, accum to dword (Saturate). |
| vperm2f128 | VPERM2F128 ymm1, ymm2, ymm3/m256, imm8 | VEX | AVX | Shuffles 128-bit float lanes between YMM registers. |
| vperm2i128 | VPERM2I128 ymm1, ymm2, ymm3/m256, imm8 | VEX | AVX2 | Shuffles two 128-bit lanes between registers. |
| vpermb | VPERMB zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-VBMI | Permutes bytes in ZMM based on index vector. |
| vpermd | VPERMD ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Full permutation of 8 integers using indices from a register. |
| vpermi2b | VPERMI2B zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-VBMI | Shuffles bytes from two ZMM registers into destination. |
| vpermi2d | VPERMI2D zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Shuffles doublewords from two ZMM registers into destination. |
| vpermi2q | VPERMI2Q zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Shuffles quadwords from two ZMM registers into destination. |
| vpermilpd | VPERMILPD ymm1, ymm2/m256, imm8 | VEX | AVX | Shuffles doubles within 128-bit lanes (AVX). |
| vpermilps | VPERMILPS ymm1, ymm2/m256, imm8 | VEX | AVX | Shuffles floats within 128-bit lanes (AVX). |
| vpermps | VPERMPS ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Full permutation of 8 floats using indices. |
| vpermq | VPERMQ ymm1, ymm2/m256, imm8 | VEX | AVX2 | Shuffles quadwords within 256-bit lanes using immediate. |
| vpermt2b | VPERMT2B zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-VBMI | Shuffles bytes from two table sources, overwriting one table (the destination). |
| vpermt2d | VPERMT2D zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Shuffles doublewords from two table sources, overwriting one table (the destination). |
| vpermt2q | VPERMT2Q zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Shuffles quadwords from two table sources, overwriting one table (the destination). |
| vpermw | VPERMW zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512BW | Full permutation of 32 words using indices. |
| vpexpandb | VPEXPANDB zmm1 {k1}, m512 | EVEX | AVX-512-VBMI2 | Expands bytes from memory into sparse locations in ZMM. |
| vpexpandw | VPEXPANDW zmm1 {k1}, m512 | EVEX | AVX-512-VBMI2 | Expands words from memory into sparse locations in ZMM. |
| vpgatherdd | VPGATHERDD ymm1, [base+ymm_idx*scale], ymm_mask | VEX | AVX2 | Gathers 32-bit integers using 32-bit indices. |
| vpgatherdq | VPGATHERDQ ymm1, [base+xmm_idx*scale], ymm_mask | VEX | AVX2 | Gathers 64-bit integers using 32-bit indices. |
| vpgatherqd | VPGATHERQD xmm1, [base+ymm_idx*scale], xmm_mask | VEX | AVX2 | Gathers 32-bit integers using 64-bit indices. |
| vpgatherqq | VPGATHERQQ ymm1, [base+ymm_idx*scale], ymm_mask | VEX | AVX2 | Gathers 64-bit integers using 64-bit indices. |
| vphaddbd | VPHADDBD xmm1, xmm2/m128 | XOP | XOP | Adds adjacent bytes to doublewords. |
| vphaddbq | VPHADDBQ xmm1, xmm2/m128 | XOP | XOP | Adds adjacent bytes to quadwords. |
| vphaddbw | VPHADDBW xmm1, xmm2/m128 | XOP | XOP | Adds adjacent bytes to words. |
| vphadddq | VPHADDDQ xmm1, xmm2/m128 | XOP | XOP | Adds adjacent doublewords to quadwords. |
| vphaddwd | VPHADDWD xmm1, xmm2/m128 | XOP | XOP | Adds adjacent words to doublewords. |
| vphaddwq | VPHADDWQ xmm1, xmm2/m128 | XOP | XOP | Adds adjacent words to quadwords. |
| vplzcntd | VPLZCNTD zmm1 {k1}, zmm2/m512 | EVEX | AVX-512CD | Counts leading zeros for each doubleword element. |
| vplzcntq | VPLZCNTQ zmm1 {k1}, zmm2/m512 | EVEX | AVX-512CD | Counts leading zeros for each quadword element. |
| vpmacssww | VPMACSSWW xmm1, xmm2, xmm3, xmm4 | XOP | XOP | Multiply-accumulate signed words with saturation. |
| vpmacsww | VPMACSWW xmm1, xmm2, xmm3, xmm4 | XOP | XOP | Multiply-accumulate signed words. |
| vpmadd52huq | VPMADD52HUQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-IFMA | Fused multiply-add for 52-bit integers (High 52 bits). |
| vpmadd52luq | VPMADD52LUQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-IFMA | Fused multiply-add for 52-bit integers (Low 52 bits). |
| vpmaxsq | VPMAXSQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Returns maximum of signed 64-bit integers. |
| vpmaxuq | VPMAXUQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Returns maximum of unsigned 64-bit integers. |
| vpminsq | VPMINSQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Returns minimum of signed 64-bit integers. |
| vpminuq | VPMINUQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Returns minimum of unsigned 64-bit integers. |
| vpmovb2m | VPMOVB2M k1, zmm1 | EVEX | AVX-512BW | Moves byte integer mask from ZMM to k-register. |
| vpmovd2m | VPMOVD2M k1, zmm1 | EVEX | AVX-512DQ | Moves doubleword integer mask from ZMM to k-register. |
| vpmovdb | VPMOVDB xmm1/m128 {k1}, zmm2 | EVEX | AVX-512F | Down-converts 32-bit integers to 8-bit. |
| vpmovm2b | VPMOVM2B zmm1, k1 | EVEX | AVX-512BW | Expands k-register bits to byte elements in ZMM. |
| vpmovm2d | VPMOVM2D zmm1, k1 | EVEX | AVX-512DQ | Expands k-register bits to doubleword elements in ZMM. |
| vpmovm2q | VPMOVM2Q zmm1, k1 | EVEX | AVX-512DQ | Expands k-register bits to quadword elements in ZMM. |
| vpmovm2w | VPMOVM2W zmm1, k1 | EVEX | AVX-512BW | Expands k-register bits to word elements in ZMM. |
| vpmovq2m | VPMOVQ2M k1, zmm1 | EVEX | AVX-512DQ | Moves quadword integer mask from ZMM to k-register. |
| vpmovsqb | VPMOVSQB xmm1/m64 {k1}, zmm2 | EVEX | AVX-512F | Down-converts 64-bit integers to 8-bit signed saturate. |
| vpmovswb | VPMOVSWB ymm1/m256 {k1}, zmm2 | EVEX | AVX-512BW | Down-converts 16-bit integers to 8-bit signed saturate. |
| vpmovusdb | VPMOVUSDB xmm1/m128 {k1}, zmm2 | EVEX | AVX-512F | Down-converts 32-bit to 8-bit with unsigned saturation. |
| vpmovusqb | VPMOVUSQB xmm1/m64 {k1}, zmm2 | EVEX | AVX-512F | Down-converts 64-bit integers to 8-bit unsigned saturate. |
| vpmovuswb | VPMOVUSWB ymm1/m256 {k1}, zmm2 | EVEX | AVX-512BW | Down-converts 16-bit integers to 8-bit unsigned saturate. |
| vpmovw2m | VPMOVW2M k1, zmm1 | EVEX | AVX-512BW | Moves word integer mask from ZMM to k-register. |
| vpmulld | VPMULLD ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Multiplies eight 32-bit integers and keeps the low 32 bits of each product (256-bit). |
| vpmullq | VPMULLQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512DQ | Multiplies 64-bit integers and keeps low 64-bit result. |
| vpmultishiftqb | VPMULTISHIFTQB zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-VBMI | Selects bytes from each 64-bit element based on shift control. |
| vpopcntb | VPOPCNTB zmm1 {k1}, zmm2/m512 | EVEX | AVX-512-BITALG | Counts set bits in each byte. |
| vpopcntd | VPOPCNTD zmm1 {k1}, zmm2/m512 | EVEX | AVX-512-VPOPCNTDQ | Counts set bits in each doubleword element. |
| vpopcntq | VPOPCNTQ zmm1 {k1}, zmm2/m512 | EVEX | AVX-512-VPOPCNTDQ | Counts set bits in each quadword element. |
| vpopcntw | VPOPCNTW zmm1 {k1}, zmm2/m512 | EVEX | AVX-512-BITALG | Counts set bits in each word element. |
| vprolq | VPROLQ zmm1 {k1}, zmm2, imm8 | EVEX | AVX-512F | Rotates 64-bit integers left. |
| vprolvd | VPROLVD zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Rotates doublewords left by amounts in second vector. |
| vprolvq | VPROLVQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Rotates quadwords left by amounts in second vector. |
| vprorq | VPRORQ zmm1 {k1}, zmm2, imm8 | EVEX | AVX-512F | Rotates 64-bit integers right. |
| vprorvd | VPRORVD zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Rotates doublewords right by amounts in second vector. |
| vprorvq | VPRORVQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Rotates quadwords right by amounts in second vector. |
| vprotb | VPROTB xmm1, xmm2/m128, imm8 | XOP | XOP | Rotates bytes in XMM register. |
| vprotd | VPROTD xmm1, xmm2/m128, imm8 | XOP | XOP | Rotates doublewords in XMM register. |
| vprotq | VPROTQ xmm1, xmm2/m128, imm8 | XOP | XOP | Rotates quadwords in XMM register. |
| vprotw | VPROTW xmm1, xmm2/m128, imm8 | XOP | XOP | Rotates words in XMM register. |
| vpshab | VPSHAB xmm1, xmm2/m128, xmm3 | XOP | XOP | Shifts bytes arithmetically by per-element counts. |
| vpshad | VPSHAD xmm1, xmm2/m128, xmm3 | XOP | XOP | Shifts doublewords arithmetically by per-element counts. |
| vpshaq | VPSHAQ xmm1, xmm2/m128, xmm3 | XOP | XOP | Shifts quadwords arithmetically by per-element counts. |
| vpshaw | VPSHAW xmm1, xmm2/m128, xmm3 | XOP | XOP | Shifts words arithmetically by per-element counts. |
| vpshldd | VPSHLDD zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512-VBMI2 | Funnel shift left of doublewords. |
| vpshldq | VPSHLDQ zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512-VBMI2 | Funnel shift left of quadwords. |
| vpshldw | VPSHLDW zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512-VBMI2 | Funnel shift left of words. |
| vpshrdd | VPSHRDD zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512-VBMI2 | Funnel shift right of doublewords. |
| vpshrdq | VPSHRDQ zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512-VBMI2 | Funnel shift right of quadwords. |
| vpshrdw | VPSHRDW zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512-VBMI2 | Funnel shift right of words. |
| vpshufb | VPSHUFB ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Shuffles 32 bytes based on indices. |
| vpshufbitqmb | VPSHUFBITQMB k1 {k2}, zmm2, zmm3/m512 | EVEX | AVX-512-BITALG | Extracts bits from bytes and packs into a mask register. |
| vpsllvd | VPSLLVD ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Shifts doublewords left by individual counts. |
| vpsllvq | VPSLLVQ ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Shifts quadwords left by individual counts. |
| vpsravd | VPSRAVD ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Shifts doublewords right arithmetic by individual counts. |
| vpsravq | VPSRAVQ zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Shifts quadwords right arithmetic by individual counts. |
| vpsrlvd | VPSRLVD ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Shifts doublewords right logical by individual counts. |
| vpsrlvq | VPSRLVQ ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Shifts quadwords right logical by individual counts. |
| vpsubd | VPSUBD ymm1, ymm2, ymm3/m256 | VEX | AVX2 | Subtracts eight 32-bit integers (256-bit). |
| vpternlogd | VPTERNLOGD zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Performs one of 256 logical operations on 3 inputs. |
| vpternlogq | VPTERNLOGQ zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Performs one of 256 logical operations on 3 quadwords. |
| vptestmb | VPTESTMB k1 {k2}, zmm2, zmm3/m512 | EVEX | AVX-512BW | Tests byte integers and sets k-register mask. |
| vptestmd | VPTESTMD k1 {k2}, zmm2, zmm3/m512 | EVEX | AVX-512F | Tests doubleword integers and sets k-register mask. |
| vptestmq | VPTESTMQ k1 {k2}, zmm2, zmm3/m512 | EVEX | AVX-512F | Tests quadword integers and sets k-register mask. |
| vptestmw | VPTESTMW k1 {k2}, zmm2, zmm3/m512 | EVEX | AVX-512BW | Tests word integers and sets k-register mask. |
| vrangeps | VRANGEPS zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512DQ | Calculates range (min/max/abs) of float values. |
| vrangess | VRANGESS xmm1 {k1}, xmm2, xmm3/m32, imm8 | EVEX | AVX-512DQ | Calculates range (min/max/abs) of low float. |
| vrcp14ps | VRCP14PS zmm1 {k1}, zmm2/m512 | EVEX | AVX-512F | Approximates 1/x with relative error below 2^-14. |
| vreduceps | VREDUCEPS zmm1 {k1}, zmm2/m512, imm8 | EVEX | AVX-512DQ | Performs reduction on floats (e.g. range reduction for trig). |
| vreducess | VREDUCESS xmm1 {k1}, xmm2, xmm3/m32, imm8 | EVEX | AVX-512DQ | Performs reduction on low float. |
| vrndscalepd | VRNDSCALEPD zmm1 {k1}, zmm2/m512, imm8 | EVEX | AVX-512F | Rounds doubles to integer values using imm8 control. |
| vrsqrt14ps | VRSQRT14PS zmm1 {k1}, zmm2/m512 | EVEX | AVX-512F | Approximates 1/sqrt(x) with relative error below 2^-14. |
| vscalefpd | VSCALEFPD zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512F | Scales doubles by exponents (x * 2^n). |
| vscatterdpd | VSCATTERDPD [base+zmm_idx*scale] {k1}, zmm1 | EVEX | AVX-512F | Stores doubles to non-contiguous memory locations. |
| vscatterdps | VSCATTERDPS [base+zmm_idx*scale] {k1}, zmm1 | EVEX | AVX-512F | Stores floats to non-contiguous memory locations. |
| vscatterpf0dpd | VSCATTERPF0DPD {k1}, [base+ymm_idx] | EVEX | AVX-512PF | Prefetches lines for scatter write (L1, Double). |
| vscatterpf0dps | VSCATTERPF0DPS {k1}, [base+zmm_idx] | EVEX | AVX-512PF | Prefetches cache lines for scatter write (L1). |
| vscatterpf0qpd | VSCATTERPF0QPD {k1}, [base+zmm_idx] | EVEX | AVX-512PF | Prefetches lines for scatter write (L1, Double, 64-bit idx). |
| vscatterpf0qps | VSCATTERPF0QPS {k1}, [base+zmm_idx] | EVEX | AVX-512PF | Prefetches lines for scatter write (L1, 64-bit idx). |
| vscatterqpd | VSCATTERQPD [base+zmm_idx*scale] {k1}, zmm1 | EVEX | AVX-512F | Stores doubles using 64-bit indices. |
| vscatterqps | VSCATTERQPS [base+zmm_idx*scale] {k1}, zmm1 | EVEX | AVX-512F | Stores floats using 64-bit indices. |
| vsha512msg1 | VSHA512MSG1 ymm1, xmm2 | VEX | SHA512 | SHA512 message schedule, intermediate calculation. |
| vsha512msg2 | VSHA512MSG2 ymm1, ymm2 | VEX | SHA512 | SHA512 message schedule, final calculation. |
| vsha512rnds2 | VSHA512RNDS2 ymm1, ymm2, xmm3 | VEX | SHA512 | SHA512 2 rounds calculation. |
| vshuff32x4 | VSHUFF32X4 zmm1 {k1}, zmm2, zmm3/m512, imm8 | EVEX | AVX-512F | Shuffles 128-bit blocks of single-precision floats. |
| vsm3msg1 | VSM3MSG1 xmm1, xmm2, xmm3 | VEX | SM3 | SM3 crypto message schedule part 1. |
| vsm3rnds2 | VSM3RNDS2 xmm1, xmm2, xmm3/m128, imm8 | VEX | SM3 | SM3 crypto 2 rounds. |
| vsm4key4 | VSM4KEY4 xmm1, xmm2, xmm3/m128 | VEX | SM4 | SM4 key expansion (four rounds). |
| vsm4rnds4 | VSM4RNDS4 xmm1, xmm2, xmm3/m128 | VEX | SM4 | SM4 crypto encryption (four rounds). |
| vsqrtph | VSQRTPH zmm1 {k1}, zmm2/m512 | EVEX | AVX-512-FP16 | Square root of half-precision values. |
| vsqrtsh | VSQRTSH xmm1 {k1}, xmm2, xmm3/m16 | EVEX | AVX-512-FP16 | Square root of low FP16 value. |
| vsubph | VSUBPH zmm1 {k1}, zmm2, zmm3/m512 | EVEX | AVX-512-FP16 | Subtracts half-precision floating-point values. |
| vsubsh | VSUBSH xmm1 {k1}, xmm2, xmm3/m16 | EVEX | AVX-512-FP16 | Subtracts low FP16 value. |
| vtestpd | VTESTPD xmm1, xmm2/m128 | VEX | AVX | Sets ZF/CF based on sign bit comparisons of doubles. |
| vtestps | VTESTPS xmm1, xmm2/m128 | VEX | AVX | Sets ZF/CF based on sign bit comparisons of floats. |
| vzeroall | VZEROALL | VEX | AVX | Clears all YMM registers. |
| vzeroupper | VZEROUPPER | VEX | AVX | Clears bits 128-255 of all YMM registers (Avoids AVX-SSE transition penalty). |
| wait | WAIT | Legacy | Base | Wait for FPU (same as FWAIT). |
| wbinvd | WBINVD | System | System | Writes back modified data and invalidates caches (Privileged). |
| wbnoinvd | WBNOINVD | Legacy | WBNOINVD | Writes back modified lines but keeps them valid in cache. |
| wrfsbase | WRFSBASE r64 | Legacy | FSGSBASE | Writes a register to the FS base address. |
| wrgsbase | WRGSBASE r64 | Legacy | FSGSBASE | Writes a register to the GS base address. |
| wrmsr | WRMSR | System | System | Writes EDX:EAX to MSR specified by ECX (Privileged). |
| wrpkru | WRPKRU | Legacy | PKU | Writes EAX to the PKRU register (ECX and EDX must be zero). |
| xabort | XABORT imm8 | Legacy | RTM (TSX) | Forces an RTM abort. |
| xadd | XADD r/m, r | Legacy | Base | Exchanges dest and src, then loads sum into dest. |
| xbegin | XBEGIN rel | Legacy | RTM (TSX) | Specifies start of Restricted Transactional Memory region. |
| xchg | XCHG r/m, r | Legacy | Base | Exchanges content of two operands. |
| xend | XEND | Legacy | RTM (TSX) | Specifies end of RTM region. |
| xgetbv | XGETBV | Legacy | XSAVE | Reads the extended control register selected by ECX (e.g. XCR0, the feature mask) into EDX:EAX. |
| xlat | XLAT m8 | Legacy | Base | Replaces AL with byte from table at [EBX+AL]. |
| xor | XOR r/m, r | Legacy | Base | Performs bitwise XOR. |
| xorps | XORPS xmm, xmm/m128 | SSE | SSE | Bitwise XOR of 128 bits (Used to clear registers). |
| xrstor | XRSTOR m | Legacy | XSAVE | Restores specified state components from memory. |
| xrstors | XRSTORS m | Legacy | XSAVES | Restores supervisor state components from memory (Compact). |
| xsave | XSAVE m | Legacy | XSAVE | Saves specified state components (AVX, SSE, etc.) to memory. |
| xsavec | XSAVEC m | Legacy | XSAVEC | Saves state components using compaction. |
| xsaveopt | XSAVEOPT m | Legacy | XSAVEOPT | Saves state components (optimized for Modified state). |
| xsaves | XSAVES m | Legacy | XSAVES | Saves supervisor state components to memory (Compact). |
| xsetbv | XSETBV | Legacy | XSAVE | Writes EDX:EAX to the extended control register selected by ECX (e.g. XCR0, which enables/disables AVX/SSE states). |
| xtest | XTEST | Legacy | RTM (TSX) | Clears ZF if the processor is executing in a transactional region; sets ZF otherwise. |
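
The `132`, `213` and `231` suffixes in the VFMADD rows name which operands are multiplied and which one is added. Operand 1 is the destination, operand 2 is what the summaries call Src1, and operand 3 is Src2. The Python sketch below models only that operand ordering. The function name `fma_operand_order` is hypothetical, and the sketch does not reproduce the single-rounding behaviour of a hardware fused multiply-add, because plain Python rounds the product before adding.

```python
def fma_operand_order(digits: str, dest: float, src1: float, src2: float) -> float:
    """Return (a * b) + c, where a, b, c are picked by the three suffix digits.

    Digit 1 selects the destination operand, digit 2 selects Src1 (the second
    operand), and digit 3 selects Src2 (the third operand).
    Illustrative only: the product is rounded before the add, unlike real FMA.
    """
    operands = {"1": dest, "2": src1, "3": src2}
    a, b, c = (operands[d] for d in digits)
    return a * b + c


# Example values chosen so every intermediate result is exactly representable.
dest, src1, src2 = 2.0, 3.0, 4.0
r132 = fma_operand_order("132", dest, src1, src2)  # (dest * src2) + src1
r213 = fma_operand_order("213", dest, src1, src2)  # (src1 * dest) + src2
r231 = fma_operand_order("231", dest, src1, src2)  # (src1 * src2) + dest
```

In each form the result is written back to the destination operand, so the choice of suffix mostly decides which input register gets overwritten.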