x86 Instructions

957 instructions.

Mnemonic	Syntax	Format	Extension	Summary
aaa	AAA	Legacy	Base (Legacy)	Adjusts AL after addition for unpacked BCD.
aad	AAD imm8	Legacy	Base (Legacy)	Adjusts AX before division for unpacked BCD.
aadd	AADD m32, r32	Legacy	RAO-INT	Atomically adds a register value to memory (remote atomic; no result returned).
aam	AAM imm8	Legacy	Base (Legacy)	Adjusts AX after multiply for unpacked BCD.
aand	AAND m32, r32	Legacy	RAO-INT	Atomically ANDs a register value into memory (remote atomic; no result returned).
aas	AAS	Legacy	Base (Legacy)	Adjusts AL after subtraction for unpacked BCD.
adc	ADC r/m, r	Legacy	Base	Adds operands and the Carry Flag (CF).
adcx	ADCX r32, r/m32	Legacy	ADX	Adds with Carry Flag (distinct from ADC, affects CF only).
add	ADD r/m, r	Legacy	Base	Adds src to dest and stores result in dest.
addpd	ADDPD xmm, xmm/m128	SSE2	SSE2	Adds two 64-bit doubles.
addps	ADDPS xmm, xmm/m128	SSE	SSE	Adds four 32-bit floats.
addsd	ADDSD xmm, xmm/m64	SSE2	SSE2	Adds the low 64-bit double.
addss	ADDSS xmm, xmm/m32	SSE	SSE	Adds the low 32-bit float.
addsubpd	ADDSUBPD xmm1, xmm2/m128	SSE3	SSE3	Adds odd elements, subtracts even elements (Double).
addsubps	ADDSUBPS xmm1, xmm2/m128	SSE3	SSE3	Adds odd elements, subtracts even elements (Complex Math).
adox	ADOX r32, r/m32	Legacy	ADX	Adds with Overflow Flag (Parallel addition with ADCX).
aesdec	AESDEC xmm1, xmm2/m128	AES-NI	AES-NI	Performs one round of AES decryption flow.
aesdec128kl	AESDEC128KL xmm, m384	Legacy	KEYLOCKER	Decrypts data using 128-bit Key Locker handle.
aesdec256kl	AESDEC256KL xmm, m512	Legacy	KEYLOCKER	Decrypts data using 256-bit Key Locker handle.
aesdecwide128kl	AESDECWIDE128KL m384	Legacy	KEYLOCKER_WIDE	Decrypts 8 blocks using 128-bit Key Locker handle.
aesdecwide256kl	AESDECWIDE256KL m512	Legacy	KEYLOCKER_WIDE	Decrypts 8 blocks using 256-bit Key Locker handle.
aesenc	AESENC xmm1, xmm2/m128	AES-NI	AES-NI	Performs one round of AES encryption flow.
aesenc128kl	AESENC128KL xmm, m384	Legacy	KEYLOCKER	Encrypts data using 128-bit Key Locker handle.
aesenc256kl	AESENC256KL xmm, m512	Legacy	KEYLOCKER	Encrypts data using 256-bit Key Locker handle.
aesenclast	AESENCLAST xmm1, xmm2/m128	AES-NI	AES-NI	Performs the last round of AES encryption.
aesencwide128kl	AESENCWIDE128KL m384	Legacy	KEYLOCKER_WIDE	Encrypts 8 blocks using 128-bit Key Locker handle.
aesencwide256kl	AESENCWIDE256KL m512	Legacy	KEYLOCKER_WIDE	Encrypts 8 blocks using 256-bit Key Locker handle.
aesimc	AESIMC xmm1, xmm2/m128	AES-NI	AES-NI	Performs AES InvMixColumns transformation (decryption helper).
aeskeygenassist	AESKEYGENASSIST xmm1, xmm2/m128, imm8	AES-NI	AES-NI	Generates round key for AES encryption.
and	AND r/m, r	Legacy	Base	Performs bitwise AND.
andn	ANDN r32, r32, r/m32	VEX	BMI1	Calculates (NOT src1) AND src2. Non-destructive.
andpd	ANDPD xmm, xmm/m128	SSE2	SSE2	Bitwise AND of 128 bits.
andps	ANDPS xmm, xmm/m128	SSE	SSE	Bitwise AND of 128 bits.
aor	AOR m32, r32	Legacy	RAO-INT	Atomically ORs a register value into memory (remote atomic; no result returned).
arpl	ARPL r/m16, r16	System	System (32-bit)	Raises the RPL of the destination selector to at least the RPL of the source selector (Legacy).
axor	AXOR m32, r32	Legacy	RAO-INT	Atomically XORs a register value into memory (remote atomic; no result returned).
bextr	BEXTR r32, r/m32, r32	VEX	BMI1	Extracts sequence of bits from source using index/length.
blcfill	BLCFILL r32, r/m32	TBM	TBM	Clears all bits below the lowest clear bit (x & (x+1)).
blci	BLCI r32, r/m32	TBM	TBM	Sets all bits to 1 except the lowest clear bit (x | ~(x+1)).
blcic	BLCIC r32, r/m32	TBM	TBM	Isolates lowest clear bit (~x & (x+1)).
blcmsk	BLCMSK r32, r/m32	TBM	TBM	Creates mask from lowest clear bit (x ^ (x+1)).
blcs	BLCS r32, r/m32	TBM	TBM	Sets lowest clear bit (x | (x+1)).
blendpd	BLENDPD xmm1, xmm2/m128, imm8	SSE4.1	SSE4.1	Selects doubles from two sources based on immediate mask.
blendps	BLENDPS xmm1, xmm2/m128, imm8	SSE4.1	SSE4.1	Selects floats from two sources based on immediate mask.
blendvpd	BLENDVPD xmm1, xmm2/m128, <XMM0>	SSE4.1	SSE4.1	Blends doubles based on variable mask in XMM0.
blendvps	BLENDVPS xmm1, xmm2/m128, <XMM0>	SSE4.1	SSE4.1	Blends floats based on variable mask in XMM0.
blsfill	BLSFILL r32, r/m32	TBM	TBM	Sets all bits below lowest set bit ((x-1) | x).
blsi	BLSI r32, r/m32	VEX	BMI1	Extracts the lowest set bit (x & -x).
blsic	BLSIC r32, r/m32	TBM	TBM	Clears the lowest set bit and sets all other bits (~x | (x-1)).
blsmsk	BLSMSK r32, r/m32	VEX	BMI1	Creates mask up to lowest set bit (x ^ (x-1)).
blsr	BLSR r32, r/m32	VEX	BMI1	Clears the lowest set bit (x & (x-1)).
bndcl	BNDCL b, r/m	Legacy	MPX	Checks if address is within lower bound.
bndcu	BNDCU b, r/m	Legacy	MPX	Checks if address is within upper bound.
bndmk	BNDMK b, m	Legacy	MPX	Creates bounds data for MPX.
bndmov	BNDMOV b, b/m	Legacy	MPX	Moves MPX bounds data.
bound	BOUND r, m	Legacy	Base (32-bit only)	Checks if operand is within bounds defined in memory.
bsf	BSF r, r/m	Legacy	Base	Scans for LSB set to 1.
bsr	BSR r, r/m	Legacy	Base	Scans for MSB set to 1.
bswap	BSWAP r32	Legacy	Base	Reverses the byte order of a register (Endian swap).
bswap	BSWAP r	Legacy	Base	Reverses the byte order of a 32/64-bit register.
bt	BT r/m, r	Legacy	Base	Selects a bit and stores it in CF.
btc	BTC r/m, r	Legacy	Base	Stores bit in CF and complements the bit.
btr	BTR r/m, r	Legacy	Base	Stores bit in CF and clears bit to 0.
bts	BTS r/m, r	Legacy	Base	Stores bit in CF and sets bit to 1.
bzhi	BZHI r32, r/m32, r32	VEX	BMI2	Clears high bits starting at index.
call	CALL rel	Legacy	Base	Push EIP/RIP and jump to target.
cbw	CBW	Legacy	Base	Sign-extends AL into AX.
clac	CLAC	Legacy	SMAP	Clears the AC flag in EFLAGS, re-enabling SMAP protection for supervisor accesses to user pages.
clc	CLC	Legacy	Base	Sets the CF flag to 0.
cld	CLD	Legacy	Base	Sets DF to 0 (String operations increment).
cldemote	CLDEMOTE m8	Legacy	CLDEMOTE	Hints to move cache line to lower cache level.
clflush	CLFLUSH m8	SSE2	SSE2	Flushes the cache line containing the operand from all caches.
clflushopt	CLFLUSHOPT m8	Legacy	CLFLUSHOPT	Optimized version of CLFLUSH (Higher throughput).
clgi	CLGI	SVM	SVM	Disables global interrupts (AMD SVM).
cli	CLI	Legacy	Base	Disables maskable hardware interrupts.
clrssbsy	CLRSSBSY m64	Legacy	CET-SS	Clears the busy flag in the shadow stack token.
cltd	CLTD	Legacy	Base	Sign-extends EAX into EDX:EAX (also CDQ).
clts	CLTS	System	System	Clears the TS flag in CR0 (Privileged).
clui	CLUI	Legacy	UINTR	Clears the User Interrupt Flag (UIF).
clwb	CLWB m8	Legacy	CLWB	Writes back modified cache line without flushing (Persistent Memory).
clzero	CLZERO	AMD	CLZERO	Clears the cache line at address RAX/EAX (AMD).
cmc	CMC	Legacy	Base	Toggles the CF flag.
cmovcc	CMOVcc r, r/m	Legacy	CMOV	Moves data if condition code is met (e.g., CMOVE, CMOVNE).
cmovg	CMOVG r, r/m	Legacy	CMOV	Move if ZF=0 and SF=OF.
cmovge	CMOVGE r, r/m	Legacy	CMOV	Move if SF=OF.
cmovl	CMOVL r, r/m	Legacy	CMOV	Move if SF!=OF.
cmovle	CMOVLE r, r/m	Legacy	CMOV	Move if ZF=1 or SF!=OF.
cmovnz	CMOVNZ r, r/m	Legacy	CMOV	Move if ZF=0.
cmovz	CMOVZ r, r/m	Legacy	CMOV	Move if ZF=1.
cmp	CMP r/m, r	Legacy	Base	Subtracts src from dest and updates flags (dest not modified).
cmpccxadd	CMPccXADD m32, r32, r32	VEX	CMPccXADD	Atomically compares memory with a register and adds to memory if the condition is met.
cmps	CMPSB	Legacy	Base	Compares byte/word at [ESI] with [EDI].
cmpsd	CMPSD xmm1, xmm2/m64, imm8	SSE2	SSE2	Compares low double-precision values and returns mask.
cmpsd	CMPSD	Legacy	Base	Compares doubleword at [ESI] with [EDI].
cmpsq	CMPSQ	Legacy	Base (64-bit)	Compares quadword at [RSI] with [RDI].
cmpss	CMPSS xmm1, xmm2/m32, imm8	SSE	SSE	Compares low single-precision values and returns mask.
cmpsw	CMPSW	Legacy	Base	Compares word at [ESI] with [EDI].
cmpxchg	CMPXCHG r/m, r	Legacy	Base	Compares accumulator with dest; if equal, dest = src; else accumulator = dest.
cmpxchg16b	CMPXCHG16B m128	Base (64-bit)	Base (64-bit)	Atomically compares 128-bit memory with RDX:RAX.
cmpxchg8b	CMPXCHG8B m64	Legacy	Base	Atomically compares EDX:EAX with memory; swaps if equal.
comisd	COMISD xmm1, xmm2/m64	SSE2	SSE2	Compares low double and sets EFLAGS (any NaN, quiet or signaling, raises an invalid-operation exception).
comiss	COMISS xmm1, xmm2/m32	SSE	SSE	Compares low float and sets EFLAGS (any NaN, quiet or signaling, raises an invalid-operation exception).
cpuid	CPUID	Legacy	Base	Returns processor information based on EAX value.
cqto	CQTO	Legacy	Base (64-bit)	Sign-extends RAX into RDX:RAX (also CQO).
crc32	CRC32 r32, r/m	SSE4.2	SSE4.2	Accumulates CRC32C value using polynomial 0x11EDC6F41.
cvtdq2pd	CVTDQ2PD xmm1, xmm2/m64	SSE2	SSE2	Converts two 32-bit integers to two 64-bit doubles.
cvtdq2ps	CVTDQ2PS xmm1, xmm2/m128	SSE2	SSE2	Converts four 32-bit integers to floats.
cvtpd2dq	CVTPD2DQ xmm1, xmm2/m128	SSE2	SSE2	Converts two doubles to two 32-bit integers (Rounded).
cvtpd2ps	CVTPD2PS xmm1, xmm2/m128	SSE2	SSE2	Converts two doubles to two floats.
cvtps2dq	CVTPS2DQ xmm1, xmm2/m128	SSE2	SSE2	Converts four floats to 32-bit integers (Rounded).
cvtps2pd	CVTPS2PD xmm1, xmm2/m64	SSE2	SSE2	Converts lower two floats to doubles.
cvtsd2si	CVTSD2SI r32, xmm/m64	SSE2	SSE2	Converts low double to integer (Rounded according to MXCSR).
cvtsd2sq	CVTSD2SQ r64, xmm/m64	SSE2	Base (64-bit)	Converts double to 64-bit integer (Rounded).
cvtsd2ss	CVTSD2SS xmm, xmm/m64	SSE2	SSE2	Converts double to float.
cvtsi2sd	CVTSI2SD xmm, r/m32	SSE2	SSE2	Converts 32-bit int to double.
cvtsi2ss	CVTSI2SS xmm, r/m32	SSE	SSE	Converts 32-bit int to float.
cvtsq2sd	CVTSQ2SD xmm1, r/m64	SSE2	Base (64-bit)	Converts 64-bit integer to double.
cvtsq2ss	CVTSQ2SS xmm1, r/m64	SSE	Base (64-bit)	Converts 64-bit integer to float.
cvtss2sd	CVTSS2SD xmm, xmm/m32	SSE2	SSE2	Converts float to double.
cvtss2si	CVTSS2SI r32, xmm/m32	SSE	SSE	Converts low float to integer (Rounded according to MXCSR).
cvtss2sq	CVTSS2SQ r64, xmm/m32	SSE	Base (64-bit)	Converts float to 64-bit integer (Rounded).
cvttpd2dq	CVTTPD2DQ xmm1, xmm2/m128	SSE2	SSE2	Converts two doubles to two 32-bit integers (Truncated).
cvttps2dq	CVTTPS2DQ xmm1, xmm2/m128	SSE2	SSE2	Converts four floats to 32-bit integers (Truncated).
cvttps2pi	CVTTPS2PI mm, xmm/m64	SSE	SSE	Converts packed floats to packed MMX integers (Truncate).
cvttsd2si	CVTTSD2SI r32, xmm/m64	SSE2	SSE2	Converts double to 32-bit int (Truncate).
cvttsd2sq	CVTTSD2SQ r64, xmm/m64	SSE2	Base (64-bit)	Converts double to 64-bit integer (Truncated).
cvttss2si	CVTTSS2SI r32, xmm/m32	SSE	SSE	Converts float to 32-bit int (Truncate).
cvttss2sq	CVTTSS2SQ r64, xmm/m32	SSE	Base (64-bit)	Converts float to 64-bit integer (Truncated).
cwd	CWD	Legacy	Base	Sign-extends AX into DX:AX.
cwtl	CWTL	Legacy	Base	Sign-extends AX into EAX (also CWDE).
daa	DAA	Legacy	Base (Legacy)	Adjusts AL after addition for packed BCD.
das	DAS	Legacy	Base (Legacy)	Adjusts AL after subtraction for packed BCD.
dec	DEC r/m	Legacy	Base	Decrements the operand by 1.
div	DIV r/m	Legacy	Base	Unsigned divide (AX / src).
divpd	DIVPD xmm, xmm/m128	SSE2	SSE2	Divides two 64-bit doubles.
divps	DIVPS xmm, xmm/m128	SSE	SSE	Divides four 32-bit floats.
divsd	DIVSD xmm1, xmm2/m64	SSE2	SSE2	Divides the low double-precision floating-point value.
divss	DIVSS xmm1, xmm2/m32	SSE	SSE	Divides the low single-precision floating-point value.
dppd	DPPD xmm1, xmm2/m128, imm8	SSE4.1	SSE4.1	Computes the dot product of two double vectors.
dpps	DPPS xmm1, xmm2/m128, imm8	SSE4.1	SSE4.1	Computes the dot product of two float vectors.
emms	EMMS	MMX	MMX	Marks all x87 FPU registers as empty (tag word = FFFFh) to allow FP instructions after MMX.
encls	ENCLS	Legacy	SGX	Executes an SGX supervisor function specified by EAX.
enclu	ENCLU	Legacy	SGX	Executes an SGX user function specified by EAX.
encodekey128	ENCODEKEY128 r32, r32	Legacy	KEYLOCKER	Wraps a 128-bit AES key into a handle.
encodekey256	ENCODEKEY256 r32, r32	Legacy	KEYLOCKER	Wraps a 256-bit AES key into a handle.
endbr32	ENDBR32	Legacy	CET-IBT	Marker instruction for Indirect Branch Tracking (IBT).
endbr64	ENDBR64	Legacy	CET-IBT	Marker instruction for Indirect Branch Tracking (IBT).
enqcmd	ENQCMD r32, m512	Legacy	ENQCMD	Writes a command to a device (DSA/IAA accelerator).
enqcmds	ENQCMDS r32, m512	Legacy	ENQCMD	Writes a command to a device (Supervisor mode).
enter	ENTER imm16, imm8	Legacy	Base	Creates a stack frame for procedure parameters.
erets	ERETS	Legacy	FRED	Returns from an event handler to supervisor mode (FRED).
eretu	ERETU	Legacy	FRED	Returns from an event handler to user mode (FRED).
extractps	EXTRACTPS r32/m32, xmm1, imm8	SSE4.1	SSE4.1	Extracts a single float from XMM to an integer register.
extrq	EXTRQ xmm1, xmm2	SSE4a	SSE4a	Extracts bit field from register (AMD SSE4a).
f2xm1	F2XM1	Legacy	x87 FPU	Computes (2^ST(0)) - 1.
fabs	FABS	Legacy	x87 FPU	Replaces ST(0) with its absolute value.
fadd	FADD m32fp/m64fp	Legacy	x87 FPU	Adds src to dest (ST(0) += src).
fchs	FCHS	Legacy	x87 FPU	Reverses the sign of ST(0).
fclex	FCLEX	Legacy	x87 FPU	Clears floating-point exception flags.
fcmovb	FCMOVB ST(0), ST(i)	Legacy	x87 FPU (P6+)	Moves ST(i) to ST(0) if CF=1.
fcmovbe	FCMOVBE ST(0), ST(i)	Legacy	x87 FPU (P6+)	Moves ST(i) to ST(0) if CF=1 or ZF=1.
fcmove	FCMOVE ST(0), ST(i)	Legacy	x87 FPU (P6+)	Moves ST(i) to ST(0) if ZF=1.
fcmovnb	FCMOVNB ST(0), ST(i)	Legacy	x87 FPU (P6+)	Moves ST(i) to ST(0) if CF=0.
fcmovnbe	FCMOVNBE ST(0), ST(i)	Legacy	x87 FPU (P6+)	Moves ST(i) to ST(0) if CF=0 and ZF=0.
fcmovne	FCMOVNE ST(0), ST(i)	Legacy	x87 FPU (P6+)	Moves ST(i) to ST(0) if ZF=0.
fcmovnu	FCMOVNU ST(0), ST(i)	Legacy	x87 FPU (P6+)	Moves ST(i) to ST(0) if PF=0.
fcmovu	FCMOVU ST(0), ST(i)	Legacy	x87 FPU (P6+)	Moves ST(i) to ST(0) if PF=1.
fcom	FCOM m32fp/m64fp	Legacy	x87 FPU	Compares ST(0) with source.
fcomi	FCOMI ST(0), ST(i)	Legacy	x87 FPU (P6+)	Compares ST(0) with ST(i) and sets CPU EFLAGS directly.
fcos	FCOS	Legacy	x87 FPU	Computes cosine of ST(0) (in radians).
fdecstp	FDECSTP	Legacy	x87 FPU	Decrements the TOP field in the FPU status word.
fdiv	FDIV m32fp/m64fp	Legacy	x87 FPU	Divides dest by src.
ffree	FFREE ST(i)	Legacy	x87 FPU	Sets the tag for ST(i) to empty.
fild	FILD m16int/m32int/m64int	Legacy	x87 FPU	Converts integer in memory to double-extended-precision float and pushes to ST(0).
fincstp	FINCSTP	Legacy	x87 FPU	Increments the TOP field in the FPU status word.
finit	FINIT	Legacy	x87 FPU	Resets FPU to default state.
fist	FIST m16int/m32int	Legacy	x87 FPU	Converts ST(0) to integer and stores in memory.
fistp	FISTP m16int/m32int/m64int	Legacy	x87 FPU	Converts ST(0) to integer, stores in memory, and pops stack.
fld	FLD m32fp/m64fp/m80fp	Legacy	x87 FPU	Pushes a floating-point value onto the FPU register stack (ST0).
fld1	FLD1	Legacy	x87 FPU	Pushes +1.0 onto the FPU register stack.
fldcw	FLDCW m2byte	Legacy	x87 FPU	Loads FPU control word from memory.
fldl2e	FLDL2E	Legacy	x87 FPU	Pushes log2(e) onto the FPU register stack.
fldl2t	FLDL2T	Legacy	x87 FPU	Pushes log2(10) onto the FPU register stack.
fldlg2	FLDLG2	Legacy	x87 FPU	Pushes log10(2) onto the FPU register stack.
fldln2	FLDLN2	Legacy	x87 FPU	Pushes ln(2) onto the FPU register stack.
fldpi	FLDPI	Legacy	x87 FPU	Pushes Pi onto the FPU register stack.
fldz	FLDZ	Legacy	x87 FPU	Pushes +0.0 onto the FPU register stack.
fmul	FMUL m32fp/m64fp	Legacy	x87 FPU	Multiplies dest by src.
fpatan	FPATAN	Legacy	x87 FPU	Computes arctan(ST(1)/ST(0)).
fprem	FPREM	Legacy	x87 FPU	Computes remainder of ST(0) / ST(1).
fptan	FPTAN	Legacy	x87 FPU	Computes tangent of ST(0) and pushes 1.0.
frndint	FRNDINT	Legacy	x87 FPU	Rounds ST(0) to integer according to RC field.
frstor	FRSTOR m108byte	Legacy	x87 FPU	Loads FPU state from memory.
fsave	FSAVE m108byte	Legacy	x87 FPU	Stores FPU state to memory and re-initializes FPU.
fscale	FSCALE	Legacy	x87 FPU	Scales ST(0) by ST(1) (ST(0) * 2^ST(1)).
fsin	FSIN	Legacy	x87 FPU	Computes sine of ST(0) (in radians).
fsincos	FSINCOS	Legacy	x87 FPU	Computes sine and cosine of ST(0); replaces ST(0) with the sine and pushes the cosine.
fsqrt	FSQRT	Legacy	x87 FPU	Computes square root of ST(0).
fst	FST m32fp/m64fp	Legacy	x87 FPU	Copies the value in ST(0) to memory or another register.
fstcw	FSTCW m2byte	Legacy	x87 FPU	Stores FPU control word to memory.
fstp	FSTP m32fp/m64fp/m80fp	Legacy	x87 FPU	Copies ST(0) to destination and pops the register stack.
fstsw	FSTSW AX	Legacy	x87 FPU	Stores FPU status word to AX or memory.
fsub	FSUB m32fp/m64fp	Legacy	x87 FPU	Subtracts src from dest.
fucom	FUCOM ST(i)	Legacy	x87 FPU	Unordered compare of ST(0) with ST(i) (quiet NaNs do not raise an exception).
fxch	FXCH ST(i)	Legacy	x87 FPU	Exchanges contents of ST(0) and ST(i).
fxtract	FXTRACT	Legacy	x87 FPU	Separates exponent and significand of ST(0).
fyl2x	FYL2X	Legacy	x87 FPU	Computes ST(1) * log2(ST(0)).
fyl2xp1	FYL2XP1	Legacy	x87 FPU	Computes ST(1) * log2(ST(0) + 1).
getsec	GETSEC	Legacy	SMX	Entry point for Safer Mode Extensions (Trusted Execution).
gf2p8affineinvqb	GF2P8AFFINEINVQB xmm1, xmm2/m128, imm8	VEX	GFNI	Computes affine transformation of the multiplicative inverse in GF(2^8).
gf2p8affineqb	GF2P8AFFINEQB xmm1, xmm2/m128, imm8	VEX	GFNI	Computes affine transformation in GF(2^8).
gf2p8mulb	GF2P8MULB xmm1, xmm2/m128	VEX	GFNI	Multiplies bytes in GF(2^8).
haddpd	HADDPD xmm1, xmm2/m128	SSE3	SSE3	Adds adjacent double-precision elements horizontally.
haddps	HADDPS xmm1, xmm2/m128	SSE3	SSE3	Adds adjacent float elements horizontally.
hlt	HLT	Legacy	Base	Stops instruction execution and places processor in HALT state.
hreset	HRESET imm8	Legacy	HRESET	Resets processor history (prediction) structures.
hsubpd	HSUBPD xmm1, xmm2/m128	SSE3	SSE3	Subtracts adjacent double-precision elements horizontally.
hsubps	HSUBPS xmm1, xmm2/m128	SSE3	SSE3	Subtracts adjacent single-precision elements horizontally.
idiv	IDIV r/m	Legacy	Base	Signed divide (AX / src).
imul	IMUL r, r/m	Legacy	Base	Signed multiply.
in	IN AL, imm8	Legacy	Base	Reads data from an I/O port into AL/AX/EAX.
in	IN AL/AX/EAX, DX	Legacy	Base	Reads data from I/O port specified in DX.
inc	INC r/m	Legacy	Base	Increments the operand by 1.
incsspq	INCSSPQ r64	Legacy	CET-SS	Increments the shadow stack pointer by 8 times the count in the low byte of r64.
ins	INSB	Legacy	Base	Reads string from I/O port to memory at [EDI].
insd	INSD	Legacy	Base	Reads doubleword from I/O port to memory at [EDI].
insertps	INSERTPS xmm1, xmm2/m32, imm8	SSE4.1	SSE4.1	Inserts a single float into a specific index of XMM.
insertq	INSERTQ xmm1, xmm2	SSE4a	SSE4a	Inserts bit field into register (AMD SSE4a).
insw	INSW	Legacy	Base	Reads word from I/O port to memory at [EDI].
int	INT imm8	Legacy	Base	Calls to interrupt procedure.
int1	INT1	Legacy	Base	Single byte opcode (0xF1) used for In-Circuit Emulation.
int3	INT3	Legacy	Base	Calls to interrupt vector 3 (Debugger breakpoint).
invd	INVD	System	System	Flushes internal caches without writing back data (Privileged).
invept	INVEPT r64, m128	VMX	VMX (EPT)	Invalidates Extended Page Table entries.
invlpg	INVLPG m	System	System	Invalidates a specific TLB entry (Privileged).
invlpga	INVLPGA	SVM	SVM	Invalidates TLB entry for specific ASID (AMD SVM).
invpcid	INVPCID r32, m128	Legacy	INVPCID	Invalidates TLB entries based on PCID.
invvpid	INVVPID r64, m128	VMX	VMX (VPID)	Invalidates TLB entries based on Virtual Processor ID.
iret	IRET	Legacy	Base	Returns from an interrupt, exception, or task handler.
iretd	IRETD	Legacy	Base	Returns from interrupt (32-bit operand size).
iretq	IRETQ	Legacy	Base (64-bit)	Returns from interrupt (64-bit operand size).
ja	JA rel	Legacy	Base	Jump if CF=0 and ZF=0 (Unsigned >).
jb	JB rel	Legacy	Base	Jump if CF=1 (Unsigned <).
je	JE rel	Legacy	Base	Jump if ZF=1 (Same as JZ).
jecxz	JECXZ rel	Legacy	Base	Jumps if ECX register is 0.
jg	JG rel	Legacy	Base	Jump if ZF=0 and SF=OF (Signed >).
jl	JL rel	Legacy	Base	Jump if SF!=OF (Signed <).
jmp	JMP rel	Legacy	Base	Unconditional jump to target.
jne	JNE rel	Legacy	Base	Jump if ZF=0 (Same as JNZ).
jno	JNO rel	Legacy	Base	Jump near if overflow flag is 0.
jnp	JNP rel	Legacy	Base	Jump near if parity flag is 0 (Odd parity).
jns	JNS rel	Legacy	Base	Jump near if sign flag is 0 (Non-negative).
jo	JO rel	Legacy	Base	Jump near if overflow flag is 1.
jp	JP rel	Legacy	Base	Jump near if parity flag is 1 (Even parity).
js	JS rel	Legacy	Base	Jump near if sign flag is 1 (Negative).
kaddb	KADDB k1, k2, k3	EVEX	AVX-512DQ	Adds two 8-bit mask registers.
kaddw	KADDW k1, k2, k3	EVEX	AVX-512DQ	Adds two 16-bit mask registers.
kandnw	KANDNW k1, k2, k3	EVEX	AVX-512	Bitwise AND NOT of 16-bit masks.
kandq	KANDQ k1, k2, k3	EVEX	AVX-512BW	Bitwise AND of 64-bit mask registers.
kandw	KANDW k1, k2, k3	EVEX	AVX-512	Bitwise AND of 16-bit masks.
kmovq	KMOVQ k1, k2/m64	EVEX	AVX-512BW	Moves 64-bit mask to/from k-register.
kmovw	KMOVW k1, k2/m16	EVEX	AVX-512	Moves 16-bit mask to/from k-register.
knotb	KNOTB k1, k2	EVEX	AVX-512DQ	Bitwise NOT of 8-bit mask.
knotd	KNOTD k1, k2	EVEX	AVX-512BW	Bitwise NOT of 32-bit mask.
knotq	KNOTQ k1, k2	EVEX	AVX-512BW	Bitwise NOT of 64-bit mask register.
knotw	KNOTW k1, k2	EVEX	AVX-512	Bitwise NOT of 16-bit mask.
korb	KORB k1, k2, k3	EVEX	AVX-512DQ	Bitwise OR of 8-bit masks.
kord	KORD k1, k2, k3	EVEX	AVX-512BW	Bitwise OR of 32-bit masks.
korq	KORQ k1, k2, k3	EVEX	AVX-512BW	Bitwise OR of 64-bit mask registers.
kortestb	KORTESTB k1, k2	EVEX	AVX-512DQ	ORs 8-bit masks and sets EFLAGS (ZF/CF).
kortestq	KORTESTQ k1, k2	EVEX	AVX-512BW	ORs 64-bit masks and sets EFLAGS (ZF/CF).
kortestw	KORTESTW k1, k2	EVEX	AVX-512	ORs two masks and sets EFLAGS (ZF, CF) based on result.
korw	KORW k1, k2, k3	EVEX	AVX-512	Bitwise OR of 16-bit masks.
kshiftlb	KSHIFTLB k1, k2, imm8	EVEX	AVX-512DQ	Logically shifts 8-bit mask left.
kshiftld	KSHIFTLD k1, k2, imm8	EVEX	AVX-512BW	Logically shifts 32-bit mask left.
kshiftlq	KSHIFTLQ k1, k2, imm8	EVEX	AVX-512BW	Logically shifts 64-bit mask left.
kshiftlw	KSHIFTLW k1, k2, imm8	EVEX	AVX-512F	Logically shifts 16-bit mask left.
kshiftrb	KSHIFTRB k1, k2, imm8	EVEX	AVX-512DQ	Logically shifts 8-bit mask right.
kshiftrd	KSHIFTRD k1, k2, imm8	EVEX	AVX-512BW	Logically shifts 32-bit mask right.
kshiftrq	KSHIFTRQ k1, k2, imm8	EVEX	AVX-512BW	Logically shifts 64-bit mask right.
kshiftrw	KSHIFTRW k1, k2, imm8	EVEX	AVX-512F	Logically shifts 16-bit mask right.
ktestb	KTESTB k1, k2	EVEX	AVX-512DQ	ANDs 8-bit masks and sets EFLAGS (ZF/CF).
ktestd	KTESTD k1, k2	EVEX	AVX-512BW	ANDs 32-bit masks and sets EFLAGS (ZF/CF).
ktestq	KTESTQ k1, k2	EVEX	AVX-512BW	ANDs 64-bit masks and sets EFLAGS (ZF/CF).
ktestw	KTESTW k1, k2	EVEX	AVX-512DQ	ANDs 16-bit masks and sets EFLAGS (ZF/CF).
kunpckbw	KUNPCKBW k1, k2, k3	EVEX	AVX-512	Concatenates two 8-bit masks into a 16-bit mask (k2 high, k3 low).
kunpckdq	KUNPCKDQ k1, k2, k3	EVEX	AVX-512BW	Concatenates two 32-bit masks into a 64-bit mask (k2 high, k3 low).
kunpckwd	KUNPCKWD k1, k2, k3	EVEX	AVX-512BW	Concatenates two 16-bit masks into a 32-bit mask (k2 high, k3 low).
kxorb	KXORB k1, k2, k3	EVEX	AVX-512DQ	Bitwise XOR of 8-bit masks.
kxord	KXORD k1, k2, k3	EVEX	AVX-512BW	Bitwise XOR of 32-bit masks.
kxorq	KXORQ k1, k2, k3	EVEX	AVX-512BW	Bitwise XOR of 64-bit masks.
kxorw	KXORW k1, k2, k3	EVEX	AVX-512	Bitwise XOR of 16-bit masks.
lahf	LAHF	Legacy	Base	Loads bits 0, 2, 4, 6, and 7 of EFLAGS into AH.
lar	LAR r, r/m16	System	System	Reads access rights from segment descriptor.
lddqu	LDDQU xmm1, m128	SSE3	SSE3	Loads unaligned data avoiding split-line penalties.
ldmxcsr	LDMXCSR m32	SSE	SSE	Loads the MXCSR control/status register from memory.
lds	LDS r, m	Legacy	Base (Legacy)	Loads pointer into DS and register.
ldtilecfg	LDTILECFG m512	VEX	AMX-TILE	Loads AMX tile configuration from memory.
lea	LEA r, m	Legacy	Base	Computes effective address and stores in register.
leave	LEAVE	Legacy	Base	Releases stack frame (MOV ESP, EBP; POP EBP).
les	LES r, m	Legacy	Base (Legacy)	Loads pointer into ES and register.
lfence	LFENCE	SSE2	SSE2	Serializes load operations (Wait for prior loads to complete).
lfs	LFS r, m	Legacy	Base	Loads pointer into FS and register.
lgdt	LGDT m16&32	System	System	Loads the GDT register (Privileged).
lgs	LGS r, m	Legacy	Base	Loads pointer into GS and register.
lidt	LIDT m16&32	System	System	Loads the IDT register (Privileged).
lkgs	LKGS r16	Legacy	LKGS	Loads the kernel GS base address (FRED support).
lldt	LLDT r/m16	System	System	Loads LDT segment selector (Privileged).
lmsw	LMSW r/m16	System	System	Loads Machine Status Word (Legacy CR0 modification).
loadiwkey	LOADIWKEY xmm1, xmm2	Legacy	KEYLOCKER	Loads the Key Locker internal wrapping key.
lods	LODSB	Legacy	Base	Loads byte/word/dword from [ESI] into AL/AX/EAX.
lodsd	LODSD	Legacy	Base	Loads doubleword from [ESI] into EAX.
lodsq	LODSQ	Legacy	Base (64-bit)	Loads quadword from [RSI] into RAX.
lodsw	LODSW	Legacy	Base	Loads word from [ESI] into AX.
loop	LOOP rel	Legacy	Base	Decrements ECX/RCX and jumps if not zero.
loope	LOOPE rel	Legacy	Base	Decrements count; jumps if count!=0 and ZF=1.
loopne	LOOPNE rel	Legacy	Base	Decrements count; jumps if count!=0 and ZF=0.
lsl	LSL r, r/m16	System	System	Reads segment limit from descriptor.
lss	LSS r, m	Legacy	Base	Loads pointer into SS and register.
ltr	LTR r/m16	System	System	Loads Task Register (Privileged).
lzcnt	LZCNT r, r/m	Legacy	ABM/BMI	Counts number of leading zeros.
maskmovdqu	MASKMOVDQU xmm, xmm	SSE2	SSE2	Non-temporal store of selected bytes (masked).
maskmovq	MASKMOVQ mm1, mm2	SSE	SSE	Non-temporal store of selected MMX bytes.
maxps	MAXPS xmm, xmm/m128	SSE	SSE	Returns maximum of packed floats.
maxsd	MAXSD xmm1, xmm2/m64	SSE2	SSE2	Returns the maximum of two low double-precision values.
maxss	MAXSS xmm1, xmm2/m32	SSE	SSE	Returns the maximum of two low single-precision values.
mfence	MFENCE	SSE2	SSE2	Serializes all load and store operations.
minps	MINPS xmm, xmm/m128	SSE	SSE	Returns minimum of packed floats.
minsd	MINSD xmm1, xmm2/m64	SSE2	SSE2	Returns the minimum of two low double-precision values.
minss	MINSS xmm1, xmm2/m32	SSE	SSE	Returns the minimum of two low single-precision values.
monitor	MONITOR	Legacy	SSE3	Sets up a linear address range to be monitored.
monitorx	MONITORX	AMD	AMD	Sets up a monitor address (AMD extension).
mov	MOV r/m, r	Legacy	Base	Copies data from source to destination.
mov cr	MOV CRn, r	System	System	Moves data to/from Control Registers (CR0, CR3, etc.) (Privileged).
mov dr	MOV DRn, r	System	System	Moves data to/from Debug Registers (DR0-DR7) (Privileged).
movapd	MOVAPD xmm, xmm/m128	SSE2	SSE2	Moves 128-bit packed double data (Must be 16-byte aligned).
movaps	MOVAPS xmm, xmm/m128	SSE	SSE	Moves 128-bit packed float data (Must be 16-byte aligned).
movbe	MOVBE r, m	Legacy	MOVBE	Moves data swapping bytes (Big Endian load/store).
movd	MOVD mm/xmm, r32/m32	SSE	MMX/SSE2	Moves 32 bits between GPR and XMM/MMX register.
movddup	MOVDDUP xmm1, xmm2/m64	SSE3	SSE3	Loads 64-bit double and duplicates it to fill 128-bit register.
movdir64b	MOVDIR64B r, m512	Legacy	MOVDIR64B	Moves a 64-byte block as a direct store with 64-byte write atomicity, to the destination address held in the register.
movdiri	MOVDIRI m, r	Legacy	MOVDIRI	Moves 32/64-bit data avoiding cache pollution (Direct IO).
movdqa	MOVDQA xmm, xmm/m128	SSE2	SSE2	Moves 128-bit integer data (Aligned).
movdqu	MOVDQU xmm, xmm/m128	SSE2	SSE2	Moves 128-bit integer data (Unaligned).
movmskpd	MOVMSKPD r32, xmm	SSE2	SSE2	Extracts sign bits from two doubles into low 2 bits of register.
movmskps	MOVMSKPS r32, xmm	SSE	SSE	Extracts sign bits from four floats into low 4 bits of register.
movntdqa	MOVNTDQA xmm1, m128	SSE4.1	SSE4.1	Efficiently loads 128-bits from WC memory (Streaming Load).
movnti	MOVNTI m32, r32	SSE2	SSE2	Stores integer register to memory bypassing cache.
movntpd	MOVNTPD m128, xmm	SSE2	SSE2	Stores double vectors directly to RAM, bypassing cache.
movntps	MOVNTPS m128, xmm	SSE	SSE	Stores float vectors directly to RAM, bypassing cache.
movntq	MOVNTQ m64, mm	SSE	SSE	Stores 64-bit MMX data bypassing cache.
movntsd	MOVNTSD m64, xmm1	SSE4a	SSE4a	Stores scalar double bypassing cache (AMD SSE4a).
movntss	MOVNTSS m32, xmm1	SSE4a	SSE4a	Stores scalar float bypassing cache (AMD SSE4a).
movq	MOVQ mm, mm/m64	MMX	MMX	Moves 64-bit data between MMX registers/memory.
movq	MOVQ xmm, xmm/m64	SSE2	SSE2	Moves 64 bits between XMM registers or memory.
movsd	MOVSD xmm1, xmm2/m64	SSE2	SSE2	Moves a single double (low 64 bits) between XMM/Memory.
movsd	MOVSD	Legacy	Base	Moves doubleword from [ESI] to [EDI].
movshdup	MOVSHDUP xmm1, xmm2/m128	SSE3	SSE3	Duplicates high element of each qword pair.
movsldup	MOVSLDUP xmm1, xmm2/m128	SSE3	SSE3	Duplicates low element of each qword pair.
movsq	MOVSQ	Legacy	Base (64-bit)	Moves quadword from [RSI] to [RDI].
movss	MOVSS xmm1, xmm2/m32	SSE	SSE	Moves a single float (low 32 bits) between XMM/Memory.
movsw	MOVSW	Legacy	Base	Moves word from [ESI] to [EDI].
movsx	MOVSX r, r/m	Legacy	Base	Copies and sign-extends a smaller value to a larger register.
movsxd	MOVSXD r64, r/m32	Base (64-bit)	Base (64-bit)	Sign-extends 32-bit register to 64-bit.
movupd	MOVUPD xmm, xmm/m128	SSE2	SSE2	Moves 128-bit packed double data (Unaligned).
movups	MOVUPS xmm, xmm/m128	SSE	SSE	Moves 128-bit packed float data (Unaligned).
movzx	MOVZX r, r/m	Legacy	Base	Copies and zero-extends a smaller value to a larger register.
mpsadbw	MPSADBW xmm1, xmm2/m128, imm8	SSE4.1	SSE4.1	Computes multiple SADs of byte blocks.
mul	MUL r/m	Legacy	Base	Unsigned multiply (AX = AL * src).
mulpd	MULPD xmm, xmm/m128	SSE2	SSE2	Multiplies two 64-bit doubles.
mulps	MULPS xmm, xmm/m128	SSE	SSE	Multiplies four 32-bit floats.
mulsd	MULSD xmm1, xmm2/m64	SSE2	SSE2	Multiplies the low double-precision floating-point value.
mulss	MULSS xmm1, xmm2/m32	SSE	SSE	Multiplies the low single-precision floating-point value.
mulx	MULX r32, r32, r/m32	VEX	BMI2	Unsigned multiply of RDX * Src. Result in Hi:Lo. No flags.
mwait	MWAIT	Legacy	SSE3	Waits for a write to a monitored address.
mwaitx	MWAITX	AMD	AMD	Waits for a write to monitored address (AMD extension).
neg	NEG r/m	Legacy	Base	Negates value (0 - operand).
nop	NOP	Legacy	Base	Does nothing (alias for XCHG EAX, EAX).
not	NOT r/m	Legacy	Base	Reverses bits of operand.
or	OR r/m, r	Legacy	Base	Performs bitwise OR.
orps	ORPS xmm, xmm/m128	SSE	SSE	Bitwise OR of 128 bits.
out	OUT imm8, AL	Legacy	Base	Writes data from AL/AX/EAX to an I/O port.
out	OUT DX, AL/AX/EAX	Legacy	Base	Writes data to I/O port specified in DX.
outs	OUTSB	Legacy	Base	Writes string from memory at [ESI] to I/O port.
outsd	OUTSD	Legacy	Base	Writes doubleword from memory at [ESI] to I/O port.
outsw	OUTSW	Legacy	Base	Writes word from memory at [ESI] to I/O port.
pabsb	PABSB xmm1, xmm2/m128	SSSE3	SSSE3	Computes absolute value of bytes.
packssdw	PACKSSDW xmm, xmm/m128	SSE2	SSE2	Converts doublewords to words with saturation.
packsswb	PACKSSWB xmm, xmm/m128	SSE2	SSE2	Converts words to bytes with saturation.
packusdw	PACKUSDW xmm1, xmm2/m128	SSE4.1	SSE4.1	Converts signed dwords to unsigned words with saturation.
packuswb	PACKUSWB xmm1, xmm2/m128	SSE2	SSE2	Converts signed words to unsigned bytes with saturation.
paddb	PADDB xmm, xmm/m128	SSE2	SSE2	Adds 16 bytes (Wraparound).
paddd	PADDD xmm, xmm/m128	SSE2	SSE2	Adds 4 doublewords (Wraparound).
paddq	PADDQ xmm, xmm/m128	SSE2	SSE2	Adds 2 quadwords (Wraparound).
paddsb	PADDSB mm, mm/m64	MMX	MMX	Adds 8 signed bytes with saturation (MMX).
paddsb	PADDSB xmm, xmm/m128	SSE2	SSE2	Adds 16 signed bytes with saturation.
paddsw	PADDSW mm, mm/m64	MMX	MMX	Adds 4 signed words with saturation (MMX).
paddsw	PADDSW xmm1, xmm2/m128	SSE2	SSE2	Adds 16-bit words with signed saturation.
paddusb	PADDUSB mm, mm/m64	MMX	MMX	Adds 8 unsigned bytes with saturation (MMX).
paddusb	PADDUSB xmm, xmm/m128	SSE2	SSE2	Adds 16 unsigned bytes with saturation.
paddusw	PADDUSW mm, mm/m64	MMX	MMX	Adds 4 unsigned words with saturation (MMX).
paddusw	PADDUSW xmm1, xmm2/m128	SSE2	SSE2	Adds 16-bit words with unsigned saturation.
paddw	PADDW xmm, xmm/m128	SSE2	SSE2	Adds 8 words (Wraparound).
palignr	PALIGNR xmm1, xmm2/m128, imm8	SSSE3	SSSE3	Concatenates dest and src, extracts 128 bits byte-aligned.
pand	PAND mm, mm/m64	MMX	MMX	Bitwise AND of 64-bit MMX registers.
pand	PAND xmm, xmm/m128	SSE2	SSE2	Bitwise AND of 128-bit integers.
pandn	PANDN mm, mm/m64	MMX	MMX	Bitwise AND NOT of 64-bit MMX registers.
pause	PAUSE	Legacy	Base	Improves performance of spin-wait loops (alias for REP NOP).
pavgb	PAVGB xmm1, xmm2/m128	SSE2	SSE2	Averages packed unsigned bytes (rounded up).
pavgw	PAVGW xmm1, xmm2/m128	SSE2	SSE2	Averages packed unsigned words (rounded up).
pblendvb	PBLENDVB xmm1, xmm2/m128, <XMM0>	SSE4.1	SSE4.1	Blends bytes based on variable mask in XMM0.
pblendw	PBLENDW xmm1, xmm2/m128, imm8	SSE4.1	SSE4.1	Selects words from two sources based on immediate mask.
pclmulqdq	PCLMULQDQ xmm1, xmm2/m128, imm8	PCLMUL	PCLMULQDQ	Performs carry-less multiplication (Galois Field math for AES-GCM).
pcmpeqb	PCMPEQB mm, mm/m64	MMX	MMX	Compares bytes for equality (MMX).
pcmpeqb	PCMPEQB xmm, xmm/m128	SSE2	SSE2	Compares bytes for equality (Result mask 0xFF or 0x00).
pcmpeqd	PCMPEQD mm, mm/m64	MMX	MMX	Compares doublewords for equality (MMX).
pcmpeqd	PCMPEQD xmm, xmm/m128	SSE2	SSE2	Compares doublewords for equality.
pcmpeqq	PCMPEQQ xmm1, xmm2/m128	SSE4.1	SSE4.1	Checks if 64-bit integer elements are equal.
pcmpeqw	PCMPEQW mm, mm/m64	MMX	MMX	Compares words for equality (MMX).
pcmpeqw	PCMPEQW xmm, xmm/m128	SSE2	SSE2	Compares words for equality.
pcmpestri	PCMPESTRI xmm1, xmm2/m128, imm8	SSE4.2	SSE4.2	Complex string search/compare; returns index in ECX.
pcmpestrm	PCMPESTRM xmm1, xmm2/m128, imm8	SSE4.2	SSE4.2	Complex string search/compare; returns mask in XMM0.
pcmpgtb	PCMPGTB mm, mm/m64	MMX	MMX	Compares bytes for greater than (MMX).
pcmpgtb	PCMPGTB xmm1, xmm2/m128	SSE2	SSE2	Compares bytes for greater than (signed).
pcmpgtd	PCMPGTD mm, mm/m64	MMX	MMX	Compares doublewords for greater than (MMX).
pcmpgtd	PCMPGTD xmm1, xmm2/m128	SSE2	SSE2	Compares doublewords for greater than (signed).
pcmpgtq	PCMPGTQ xmm1, xmm2/m128	SSE4.2	SSE4.2	Compares quadwords for greater than (signed).
pcmpgtw	PCMPGTW mm, mm/m64	MMX	MMX	Compares words for greater than (MMX).
pcmpgtw	PCMPGTW xmm1, xmm2/m128	SSE2	SSE2	Compares words for greater than (signed).
pcmpistri	PCMPISTRI xmm1, xmm2/m128, imm8	SSE4.2	SSE4.2	String search (null-terminated); returns index in ECX.
pconfig	PCONFIG	Legacy	PCONFIG	Configures platform features like MKTME (Memory Encryption).
pdep	PDEP r32, r32, r/m32	VEX	BMI2	Scatters bits from LSB of source to positions marked in mask.
pext	PEXT r32, r32, r/m32	VEX	BMI2	Extracts bits from source using mask and packs them to LSB.
pextrb	PEXTRB r32/m8, xmm1, imm8	SSE4.1	SSE4.1	Extracts a byte from XMM to integer register.
pextrd	PEXTRD r32/m32, xmm1, imm8	SSE4.1	SSE4.1	Extracts a doubleword from XMM to register.
pextrq	PEXTRQ r64/m64, xmm1, imm8	SSE4.1	SSE4.1	Extracts a quadword from XMM to register.
pextrw	PEXTRW r32, xmm1, imm8	SSE2	SSE2	Extracts a word from XMM to integer register.
pfadd	PFADD mm, mm/m64	3DNow!	3DNow!	Adds two packed floats (3DNow!).
pfmul	PFMUL mm, mm/m64	3DNow!	3DNow!	Multiplies packed floats (3DNow!).
pfrcp	PFRCP mm, mm/m64	3DNow!	3DNow!	Approximates reciprocal (3DNow!).
pfrsqrt	PFRSQRT mm, mm/m64	3DNow!	3DNow!	Approximates reciprocal sqrt (3DNow!).
pfsub	PFSUB mm, mm/m64	3DNow!	3DNow!	Subtracts packed floats (3DNow!).
phaddw	PHADDW xmm1, xmm2/m128	SSSE3	SSSE3	Adds adjacent 16-bit integers horizontally.
phminposuw	PHMINPOSUW xmm1, xmm2/m128	SSE4.1	SSE4.1	Finds minimum word and its index.
phsubd	PHSUBD xmm1, xmm2/m128	SSSE3	SSSE3	Subtracts adjacent 32-bit integers horizontally.
phsubw	PHSUBW xmm1, xmm2/m128	SSSE3	SSSE3	Subtracts adjacent 16-bit integers horizontally.
pinsrb	PINSRB xmm1, r32/m8, imm8	SSE4.1	SSE4.1	Inserts a byte from integer register into XMM.
pinsrd	PINSRD xmm1, r32/m32, imm8	SSE4.1	SSE4.1	Inserts a doubleword from register to XMM.
pinsrq	PINSRQ xmm1, r64/m64, imm8	SSE4.1	SSE4.1	Inserts a quadword from register to XMM.
pinsrw	PINSRW xmm1, r32/m16, imm8	SSE2	SSE2	Inserts a word from integer register into XMM.
pmaddubsw	PMADDUBSW xmm1, xmm2/m128	SSSE3	SSSE3	Multiplies unsigned bytes (dest) by signed bytes (src) and adds adjacent pairs to words with signed saturation.
pmaddwd	PMADDWD mm, mm/m64	MMX	MMX	Multiplies words and adds adjacent pairs (MMX).
pmaddwd	PMADDWD xmm1, xmm2/m128	SSE2	SSE2	Multiplies words, adds adjacent pairs to doublewords.
pmaxsb	PMAXSB xmm1, xmm2/m128	SSE4.1	SSE4.1	Returns maximum of signed bytes.
pmaxsd	PMAXSD xmm1, xmm2/m128	SSE4.1	SSE4.1	Returns maximum of signed doublewords.
pmaxsw	PMAXSW xmm1, xmm2/m128	SSE2	SSE2	Returns maximum of signed words.
pmaxub	PMAXUB xmm1, xmm2/m128	SSE2	SSE2	Returns maximum of unsigned bytes.
pmaxud	PMAXUD xmm1, xmm2/m128	SSE4.1	SSE4.1	Returns maximum of unsigned doublewords.
pmaxuw	PMAXUW xmm1, xmm2/m128	SSE4.1	SSE4.1	Returns maximum of unsigned words.
pminsb	PMINSB xmm1, xmm2/m128	SSE4.1	SSE4.1	Returns minimum of signed bytes.
pminsd	PMINSD xmm1, xmm2/m128	SSE4.1	SSE4.1	Returns minimum of signed doublewords.
pminsw	PMINSW xmm1, xmm2/m128	SSE2	SSE2	Returns minimum of signed words.
pminub	PMINUB xmm1, xmm2/m128	SSE2	SSE2	Returns minimum of unsigned bytes.
pminud	PMINUD xmm1, xmm2/m128	SSE4.1	SSE4.1	Returns minimum of unsigned doublewords.
pminuw	PMINUW xmm1, xmm2/m128	SSE4.1	SSE4.1	Returns minimum of unsigned words.
pmovmskb	PMOVMSKB r32, xmm	SSE2	SSE2	Creates a mask from the MSB of each byte in XMM.
pmovsxbq	PMOVSXBQ xmm1, xmm2/m16	SSE4.1	SSE4.1	Sign extends 8-bit integers to 64-bit.
pmovsxbw	PMOVSXBW xmm1, xmm2/m64	SSE4.1	SSE4.1	Sign extends 8-bit integers to 16-bit.
pmovsxdq	PMOVSXDQ xmm1, xmm2/m64	SSE4.1	SSE4.1	Sign extends 32-bit integers to 64-bit.
pmovsxwd	PMOVSXWD xmm1, xmm2/m64	SSE4.1	SSE4.1	Sign extends 16-bit integers to 32-bit.
pmovsxwq	PMOVSXWQ xmm1, xmm2/m32	SSE4.1	SSE4.1	Sign extends 16-bit integers to 64-bit.
pmovzxbd	PMOVZXBD xmm1, xmm2/m32	SSE4.1	SSE4.1	Zero extends 8-bit integers to 32-bit.
pmovzxbq	PMOVZXBQ xmm1, xmm2/m16	SSE4.1	SSE4.1	Zero extends 8-bit integers to 64-bit.
pmovzxbw	PMOVZXBW xmm1, xmm2/m64	SSE4.1	SSE4.1	Zero extends 8-bit integers to 16-bit.
pmovzxdq	PMOVZXDQ xmm1, xmm2/m64	SSE4.1	SSE4.1	Zero extends 32-bit integers to 64-bit.
pmovzxwd	PMOVZXWD xmm1, xmm2/m64	SSE4.1	SSE4.1	Zero extends 16-bit integers to 32-bit.
pmovzxwq	PMOVZXWQ xmm1, xmm2/m32	SSE4.1	SSE4.1	Zero extends 16-bit integers to 64-bit.
pmulhrsw	PMULHRSW xmm1, xmm2/m128	SSSE3	SSSE3	Multiplies signed 16-bit words, rounds, and scales.
pmulhuw	PMULHUW xmm1, xmm2/m128	SSE2	SSE2	Multiplies unsigned words, keeps high 16 bits.
pmulhw	PMULHW mm, mm/m64	MMX	MMX	Multiplies 4 signed words and stores high 16 bits (MMX).
pmulhw	PMULHW xmm1, xmm2/m128	SSE2	SSE2	Multiplies signed words, keeps high 16 bits.
pmulld	PMULLD xmm1, xmm2/m128	SSE4.1	SSE4.1	Multiplies 32-bit integers, stores low 32-bit result.
pmullw	PMULLW mm, mm/m64	MMX	MMX	Multiplies 4 words and stores low 16 bits (MMX).
pmullw	PMULLW xmm1, xmm2/m128	SSE2	SSE2	Multiplies 16-bit words and stores low 16-bit result.
pmuludq	PMULUDQ xmm1, xmm2/m128	SSE2	SSE2	Multiplies low 32-bits of each 64-bit chunk to 64-bit result.
pop	POP r/m	Legacy	Base	Loads operand from stack and increments SP.
popa	POPA	Legacy	Base (32-bit only)	Pops into DI, SI, BP, BX, DX, CX, AX; the saved SP value is discarded (Invalid in 64-bit).
popcnt	POPCNT r, r/m	Legacy	SSE4.2	Counts number of bits set to 1.
popf	POPF	Legacy	Base	Pops stack into EFLAGS.
por	POR mm, mm/m64	MMX	MMX	Bitwise OR of 64-bit MMX registers.
por	POR xmm, xmm/m128	SSE2	SSE2	Bitwise OR of 128-bit integers.
prefetchnta	PREFETCHNTA m8	SSE	SSE	Prefetches data to non-temporal cache structure (minimize pollution).
prefetcht0	PREFETCHT0 m8	SSE	SSE	Prefetches data to L1 cache.
prefetcht1	PREFETCHT1 m8	SSE	SSE	Hints to fetch data to L2 and L3 caches.
prefetcht2	PREFETCHT2 m8	SSE	SSE	Hints to fetch data to L3 cache only.
prefetchw	PREFETCHW m8	Legacy	PREFETCHW	Prefetches data with intent to write (RFO).
prefetchwt1	PREFETCHWT1 m8	Legacy	PREFETCHWT1	Prefetches data to L2 (T1 hint) with intent to write.
psadbw	PSADBW xmm1, xmm2/m128	SSE2	SSE2	Computes absolute differences of bytes and sums them to words.
pshufb	PSHUFB xmm1, xmm2/m128	SSSE3	SSSE3	Shuffles bytes according to indices in source operand.
pshufd	PSHUFD xmm, xmm/m128, imm8	SSE2	SSE2	Shuffles 32-bit integers.
pshufhw	PSHUFHW xmm1, xmm2/m128, imm8	SSE2	SSE2	Shuffles the high 4 words of XMM.
pshuflw	PSHUFLW xmm1, xmm2/m128, imm8	SSE2	SSE2	Shuffles the low 4 words of XMM.
psignb	PSIGNB xmm1, xmm2/m128	SSSE3	SSSE3	Negates/Zeroes bytes in dest based on sign of src.
pslld	PSLLD mm, imm8	MMX	MMX	Shifts doublewords left (MMX).
pslld	PSLLD xmm, imm8	SSE2	SSE2	Shifts doublewords left.
pslldq	PSLLDQ xmm1, imm8	SSE2	SSE2	Shifts the entire 128-bit register left by bytes.
psllq	PSLLQ mm, imm8	MMX	MMX	Shifts quadword left (MMX).
psllw	PSLLW mm, imm8	MMX	MMX	Shifts words left (MMX).
psllw	PSLLW xmm, imm8	SSE2	SSE2	Shifts words left.
psrad	PSRAD mm, imm8	MMX	MMX	Shifts doublewords right arithmetic (MMX).
psrad	PSRAD xmm, imm8	SSE2	SSE2	Shifts doublewords right arithmetic.
psraw	PSRAW mm, imm8	MMX	MMX	Shifts words right arithmetic (MMX).
psraw	PSRAW xmm, imm8	SSE2	SSE2	Shifts words right arithmetic (sign bit).
psrld	PSRLD mm, imm8	MMX	MMX	Shifts doublewords right logical (MMX).
psrld	PSRLD xmm, imm8	SSE2	SSE2	Shifts doublewords right logical.
psrldq	PSRLDQ xmm1, imm8	SSE2	SSE2	Shifts the entire 128-bit register right by bytes.
psrlq	PSRLQ mm, imm8	MMX	MMX	Shifts quadword right logical (MMX).
psrlw	PSRLW mm, imm8	MMX	MMX	Shifts words right logical (MMX).
psrlw	PSRLW xmm, imm8	SSE2	SSE2	Shifts words right logical.
psubb	PSUBB xmm, xmm/m128	SSE2	SSE2	Subtracts 16 bytes.
psubd	PSUBD xmm, xmm/m128	SSE2	SSE2	Subtracts 4 doublewords.
psubq	PSUBQ xmm1, xmm2/m128	SSE2	SSE2	Subtracts packed quadwords.
psubsb	PSUBSB mm, mm/m64	MMX	MMX	Subtracts 8 signed bytes with saturation (MMX).
psubsw	PSUBSW mm, mm/m64	MMX	MMX	Subtracts 4 signed words with saturation (MMX).
psubsw	PSUBSW xmm1, xmm2/m128	SSE2	SSE2	Subtracts 16-bit words with signed saturation.
psubusb	PSUBUSB mm, mm/m64	MMX	MMX	Subtracts 8 unsigned bytes with saturation (MMX).
psubusw	PSUBUSW mm, mm/m64	MMX	MMX	Subtracts 4 unsigned words with saturation (MMX).
psubusw	PSUBUSW xmm1, xmm2/m128	SSE2	SSE2	Subtracts 16-bit words with unsigned saturation.
psubw	PSUBW xmm, xmm/m128	SSE2	SSE2	Subtracts 8 words.
ptest	PTEST xmm1, xmm2/m128	SSE4.1	SSE4.1	Bitwise test of two 128-bit values: sets ZF from the AND and CF from the AND NOT; no result is stored.
ptwrite	PTWRITE r32/r64	Legacy	PTWRITE	Writes data to the Intel Processor Trace stream.
punpckhbw	PUNPCKHBW xmm1, xmm2/m128	SSE2	SSE2	Interleaves high bytes from two sources.
punpckhdq	PUNPCKHDQ xmm1, xmm2/m128	SSE2	SSE2	Interleaves high doublewords.
punpckhqdq	PUNPCKHQDQ xmm1, xmm2/m128	SSE2	SSE2	Interleaves high quadwords.
punpckhwd	PUNPCKHWD xmm1, xmm2/m128	SSE2	SSE2	Interleaves high words.
punpcklbw	PUNPCKLBW xmm, xmm/m128	SSE2	SSE2	Interleaves low bytes from two sources.
punpckldq	PUNPCKLDQ xmm, xmm/m128	SSE2	SSE2	Interleaves low doublewords.
punpcklqdq	PUNPCKLQDQ xmm, xmm/m128	SSE2	SSE2	Interleaves low quadwords.
punpcklwd	PUNPCKLWD xmm, xmm/m128	SSE2	SSE2	Interleaves low words.
push	PUSH r/m	Legacy	Base	Decrements SP and stores operand on stack.
pusha	PUSHA	Legacy	Base (32-bit only)	Pushes AX, CX, DX, BX, SP, BP, SI, DI (Invalid in 64-bit).
pushf	PUSHF	Legacy	Base	Pushes EFLAGS onto stack.
pxor	PXOR mm, mm/m64	MMX	MMX	Bitwise XOR of 64-bit MMX registers.
pxor	PXOR xmm, xmm/m128	SSE2	SSE2	Bitwise XOR of 128-bit integers.
rcl	RCL r/m, imm8	Legacy	Base	Rotates bits left through Carry Flag.
rcpps	RCPPS xmm, xmm/m128	SSE	SSE	Approximate reciprocal (1/x) of four 32-bit floats.
rcpss	RCPSS xmm1, xmm2/m32	SSE	SSE	Computes approximate reciprocal (1/x) of low float.
rcr	RCR r/m, imm8	Legacy	Base	Rotates bits right through Carry Flag.
rdfsbase	RDFSBASE r64	Legacy	FSGSBASE	Reads the FS base address into a register.
rdgsbase	RDGSBASE r64	Legacy	FSGSBASE	Reads the GS base address into a register.
rdmsr	RDMSR	System	System	Reads MSR specified by ECX into EDX:EAX (Privileged).
rdpid	RDPID r32	Legacy	RDPID	Reads the processor ID (TSC_AUX) into register.
rdpkru	RDPKRU	Legacy	PKU	Reads PKRU register into EAX (User-mode pages).
rdpmc	RDPMC	System	System	Reads performance counter specified by ECX into EDX:EAX.
rdrand	RDRAND r32	Legacy	RDRAND	Retrieves a hardware-generated random number.
rdseed	RDSEED r32	Legacy	RDSEED	Retrieves a random seed from hardware entropy source.
rdsspq	RDSSPQ r64	Legacy	CET-SS	Reads the current shadow stack pointer into a register.
rdtsc	RDTSC	Legacy	Base	Reads the time-stamp counter into EDX:EAX.
rdtscp	RDTSCP	Legacy	RDTSCP	Reads TSC into EDX:EAX and Processor ID (TSC_AUX) into ECX.
rep movs	REP MOVS m, m	Legacy	Base	Moves ECX bytes/words from [ESI] to [EDI].
rep stos	REP STOS m	Legacy	Base	Fills [EDI] with AL/AX/EAX for ECX repeats.
repe cmps	REPE CMPS m, m	Legacy	Base	Compares [ESI] and [EDI] until mismatch or ECX=0.
repne scas	REPNE SCAS m	Legacy	Base	Scans [EDI] for AL/AX/EAX until match or ECX=0.
ret	RET	Legacy	Base	Pops the return address into EIP/RIP and resumes execution.
rol	ROL r/m, imm8	Legacy	Base	Rotates bits left.
ror	ROR r/m, imm8	Legacy	Base	Rotates bits right.
rorx	RORX r32, r/m32, imm8	VEX	BMI2	Rotate right with immediate. No flags update.
roundpd	ROUNDPD xmm1, xmm2/m128, imm8	SSE4.1	SSE4.1	Rounds all packed doubles according to immediate mode.
roundps	ROUNDPS xmm1, xmm2/m128, imm8	SSE4.1	SSE4.1	Rounds all packed floats according to immediate mode.
roundsd	ROUNDSD xmm1, xmm2/m64, imm8	SSE4.1	SSE4.1	Rounds low double according to immediate mode.
roundss	ROUNDSS xmm1, xmm2/m32, imm8	SSE4.1	SSE4.1	Rounds low float according to immediate mode.
rsm	RSM	System	System (SMM)	Exits SMM and returns to previous state (Privileged).
rsqrtps	RSQRTPS xmm, xmm/m128	SSE	SSE	Approximate reciprocal sqrt (1/sqrt(x)) of four 32-bit floats.
rsqrtss	RSQRTSS xmm1, xmm2/m32	SSE	SSE	Computes approximate reciprocal sqrt (1/sqrt(x)) of low float.
rstorssp	RSTORSSP m64	Legacy	CET-SS	Restores SSP from memory token.
sahf	SAHF	Legacy	Base	Loads SF, ZF, AF, PF, and CF from AH.
sal	SAL r/m, imm8	Legacy	Base	Shifts bits left (Alias for SHL).
sar	SAR r/m, imm8	Legacy	Base	Shifts bits right, preserving sign bit.
sarx	SARX r32, r/m32, r32	VEX	BMI2	Arithmetic right shift, count in register. No flags update.
saveprevssp	SAVEPREVSSP	Legacy	CET-SS	Saves the previous SSP to the shadow stack token.
sbb	SBB r/m, r	Legacy	Base	Subtracts operands and the Carry Flag (CF).
scas	SCASB	Legacy	Base	Compares AL/AX/EAX with memory at [EDI].
scasd	SCASD	Legacy	Base	Compares EAX with memory at [EDI].
scasq	SCASQ	Legacy	Base (64-bit)	Compares RAX with memory at [RDI].
scasw	SCASW	Legacy	Base	Compares AX with memory at [EDI].
senduipi	SENDUIPI r64	Legacy	UINTR	Sends a User IPI to another processor.
serialize	SERIALIZE	Legacy	SERIALIZE	Forces serialization of instruction fetch/execution.
seta	SETA r/m8	Legacy	Base	Sets byte to 1 if CF=0 and ZF=0.
setae	SETAE r/m8	Legacy	Base	Sets byte to 1 if CF=0.
setb	SETB r/m8	Legacy	Base	Sets byte to 1 if CF=1.
setbe	SETBE r/m8	Legacy	Base	Sets byte to 1 if CF=1 or ZF=1.
setcc	SETcc r/m8	Legacy	Base	Sets byte to 1 if condition met, else 0 (e.g., SETE, SETZ).
setg	SETG r/m8	Legacy	Base	Sets byte to 1 if ZF=0 and SF=OF.
setge	SETGE r/m8	Legacy	Base	Sets byte to 1 if SF=OF.
setl	SETL r/m8	Legacy	Base	Sets byte to 1 if SF!=OF.
setle	SETLE r/m8	Legacy	Base	Sets byte to 1 if ZF=1 or SF!=OF.
setno	SETNO r/m8	Legacy	Base	Sets byte to 1 if OF=0.
setnp	SETNP r/m8	Legacy	Base	Sets byte to 1 if PF=0 (Odd Parity).
setns	SETNS r/m8	Legacy	Base	Sets byte to 1 if SF=0 (Non-negative).
setnz	SETNZ r/m8	Legacy	Base	Sets byte to 1 if ZF=0.
seto	SETO r/m8	Legacy	Base	Sets byte to 1 if OF=1.
setp	SETP r/m8	Legacy	Base	Sets byte to 1 if PF=1 (Even Parity).
sets	SETS r/m8	Legacy	Base	Sets byte to 1 if SF=1 (Negative).
setz	SETZ r/m8	Legacy	Base	Sets byte to 1 if ZF=1.
sfence	SFENCE	SSE	SSE	Serializes store operations (Wait for prior stores to complete).
sgdt	SGDT m	System	System	Stores GDT limit and base address to memory.
sha1msg1	SHA1MSG1 xmm1, xmm2/m128	Legacy	SHA	Performs intermediate calculation for SHA1 message schedule.
sha1msg2	SHA1MSG2 xmm1, xmm2/m128	Legacy	SHA	Performs final calculation for SHA1 message schedule.
sha1nexte	SHA1NEXTE xmm1, xmm2/m128	Legacy	SHA	Calculates SHA1 state variable E.
sha1rnds4	SHA1RNDS4 xmm1, xmm2/m128, imm8	Legacy	SHA	Performs 4 rounds of SHA1 operation.
sha256msg1	SHA256MSG1 xmm1, xmm2/m128	Legacy	SHA	Performs intermediate calculation for SHA256 message schedule.
sha256msg2	SHA256MSG2 xmm1, xmm2/m128	Legacy	SHA	Performs final calculation for SHA256 message schedule.
sha256rnds2	SHA256RNDS2 xmm1, xmm2/m128, xmm0	Legacy	SHA	Performs 2 rounds of SHA256 operation.
shl	SHL r/m, imm8	Legacy	Base	Shifts bits left (same as SAL).
shld	SHLD r/m, r, imm8	Legacy	Base	Shifts dest left, filling with bits from src.
shlx	SHLX r32, r/m32, r32	VEX	BMI2	Logical left shift, count in register. No flags update.
shr	SHR r/m, imm8	Legacy	Base	Shifts bits right, filling with zeros.
shrd	SHRD r/m, r, imm8	Legacy	Base	Shifts dest right, filling with bits from src.
shrx	SHRX r32, r/m32, r32	VEX	BMI2	Logical right shift, count in register. No flags update.
shufpd	SHUFPD xmm1, xmm2/m128, imm8	SSE2	SSE2	Shuffles 64-bit doubles between two XMM registers.
shufps	SHUFPS xmm1, xmm2/m128, imm8	SSE	SSE	Shuffles 32-bit floats between two XMM registers.
sidt	SIDT m	System	System	Stores IDT limit and base address to memory.
sldt	SLDT r/m16	System	System	Stores LDT segment selector.
smsw	SMSW r/m16	System	System	Stores Machine Status Word.
sqrtpd	SQRTPD xmm, xmm/m128	SSE2	SSE2	Computes square root of two 64-bit doubles.
sqrtps	SQRTPS xmm, xmm/m128	SSE	SSE	Computes square root of four 32-bit floats.
sqrtsd	SQRTSD xmm1, xmm2/m64	SSE2	SSE2	Computes square root of the low double.
sqrtss	SQRTSS xmm1, xmm2/m32	SSE	SSE	Computes square root of the low float.
stac	STAC	Legacy	SMAP	Sets Alignment Check flag (Allow user memory access).
stc	STC	Legacy	Base	Sets the CF flag to 1.
std	STD	Legacy	Base	Sets DF to 1 (String operations decrement).
stgi	STGI	SVM	SVM	Enables global interrupts (AMD SVM).
sti	STI	Legacy	Base	Enables maskable hardware interrupts.
stmxcsr	STMXCSR m32	SSE	SSE	Stores the MXCSR register to memory.
stos	STOSB	Legacy	Base	Stores AL/AX/EAX to memory at [EDI].
stosd	STOSD	Legacy	Base	Stores EAX to memory at [EDI].
stosq	STOSQ	Legacy	Base (64-bit)	Stores RAX to memory at [RDI].
stosw	STOSW	Legacy	Base	Stores AX to memory at [EDI].
str	STR r/m16	System	System	Stores Task Register.
sttilecfg	STTILECFG m512	VEX	AMX-TILE	Stores AMX tile configuration to memory.
stui	STUI	Legacy	UINTR	Sets the User Interrupt Flag (UIF).
sub	SUB r/m, r	Legacy	Base	Subtracts src from dest.
subpd	SUBPD xmm, xmm/m128	SSE2	SSE2	Subtracts two 64-bit doubles.
subps	SUBPS xmm, xmm/m128	SSE	SSE	Subtracts four 32-bit floats.
subsd	SUBSD xmm1, xmm2/m64	SSE2	SSE2	Subtracts the low double-precision floating-point value.
subss	SUBSS xmm1, xmm2/m32	SSE	SSE	Subtracts the low single-precision floating-point value.
swapgs	SWAPGS	Legacy	Base (64-bit System)	Swaps user/kernel GS base address (System).
syscall	SYSCALL	System	System (64-bit)	Fast call to privilege level 0 system procedures.
sysenter	SYSENTER	System	System	Fast call to level 0 system procedures.
sysexit	SYSEXIT	System	System	Fast return to level 3 user code.
sysret	SYSRET	System	System (64-bit)	Fast return to privilege level 3 user code.
t1mskc	T1MSKC r32, r/m32	TBM	TBM	Creates mask from trailing ones (~x | (x+1)).
tdpbf16ps	TDPBF16PS tmm1, tmm2, tmm3	VEX	AMX-BF16	Matrix multiply (BFloat16) accumulating to Float32.
tdpbssd	TDPBSSD tmm1, tmm2, tmm3	VEX	AMX-INT8	Matrix multiply (Signed Int8 * Signed Int8) accumulating to Int32.
tdpbsud	TDPBSUD tmm1, tmm2, tmm3	VEX	AMX-INT8	Matrix multiply (Signed * Unsigned) accumulating to Int32.
tdpbusd	TDPBUSD tmm1, tmm2, tmm3	VEX	AMX-INT8	Matrix multiply (Unsigned * Signed) accumulating to Int32.
tdpbuud	TDPBUUD tmm1, tmm2, tmm3	VEX	AMX-INT8	Matrix multiply (Unsigned * Unsigned) accumulating to Int32.
tdpfp16ps	TDPFP16PS tmm1, tmm2, tmm3	VEX	AMX-FP16	Matrix multiply (FP16 * FP16) accumulating to Float32.
test	TEST r/m, r	Legacy	Base	ANDs operands and updates flags (result discarded).
testui	TESTUI	Legacy	UINTR	Copies the User Interrupt Flag (UIF) into CF; clears ZF, AF, PF, SF, and OF.
tileloadd	TILELOADD tmm1, m	VEX	AMX-TILE	Loads data into an AMX tile register.
tileloaddt1	TILELOADDT1 tmm1, m	VEX	AMX-TILE	Loads data into an AMX tile register with T1 hint.
tilestored	TILESTORED m, tmm1	VEX	AMX-TILE	Stores data from an AMX tile register to memory.
tilezero	TILEZERO tmm1	VEX	AMX-TILE	Clears an AMX tile register.
tpause	TPAUSE r32	Legacy	WAITPKG	Pauses execution for a specified time or until trigger.
tzcnt	TZCNT r32, r/m32	Legacy	BMI1	Counts the number of trailing zeros.
tzmsk	TZMSK r32, r/m32	TBM	TBM	Creates mask from trailing zeros (~x & (x-1)).
ucomisd	UCOMISD xmm1, xmm2/m64	SSE2	SSE2	Compares low double and sets EFLAGS.
ucomiss	UCOMISS xmm1, xmm2/m32	SSE	SSE	Compares low float and sets EFLAGS.
ud0	UD0	Legacy	Base	Generates invalid opcode exception.
ud2	UD2	Legacy	Base	Generates an invalid opcode exception.
uiret	UIRET	Legacy	UINTR	Returns from a User Interrupt handler.
umonitor	UMONITOR r64	Legacy	WAITPKG	Sets up a monitor address for User Wait instructions.
umwait	UMWAIT r32	Legacy	WAITPKG	Waits for store to monitored address (Low power state).
unpckhpd	UNPCKHPD xmm1, xmm2/m128	SSE2	SSE2	Interleaves high doubles from two sources.
unpckhps	UNPCKHPS xmm1, xmm2/m128	SSE	SSE	Interleaves high floats from two registers.
unpcklpd	UNPCKLPD xmm1, xmm2/m128	SSE2	SSE2	Interleaves low doubles from two sources.
unpcklps	UNPCKLPS xmm1, xmm2/m128	SSE	SSE	Interleaves low floats from two registers.
v4fmaddps	V4FMADDPS zmm1 {k1}, zmm2+3, m128	EVEX	AVX-512-4FMAPS	4-way FMA for Neural Nets (Single).
v4fmaddss	V4FMADDSS xmm1 {k1}, xmm2+3, m128	EVEX	AVX-512-4FMAPS	4-way FMA for Neural Nets (Scalar).
v4fnmaddps	V4FNMADDPS zmm1 {k1}, zmm2+3, m128	EVEX	AVX-512-4FMAPS	4-way Negative FMA for Neural Nets (Single).
v4fnmaddss	V4FNMADDSS xmm1 {k1}, xmm2+3, m128	EVEX	AVX-512-4FMAPS	4-way Negative FMA for Neural Nets (Scalar).
vaddph	VADDPH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Adds half-precision floating-point values.
vaddps	VADDPS ymm1, ymm2, ymm3/m256	VEX	AVX	Adds packed floats (256-bit YMM support).
vaddsh	VADDSH xmm1 {k1}, xmm2, xmm3/m16	EVEX	AVX-512-FP16	Adds low FP16 value.
vaddss	VADDSS xmm1 {k1}, xmm2, xmm3/m32	EVEX	AVX-512F	Adds scalar single precision (EVEX encoded with masking).
vaesdec	VAESDEC zmm1, zmm2, zmm3/m512	EVEX	AVX-512-VAES	AES Decrypt on 512-bit vector.
vaesenc	VAESENC zmm1, zmm2, zmm3/m512	EVEX	AVX-512-VAES	AES Encrypt on 512-bit vector.
valignd	VALIGND zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Extracts 512-bits from two concatenated ZMMs shifted by count.
valignq	VALIGNQ zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Extracts 512-bits from two concatenated ZMMs shifted by count.
vbroadcastf128	VBROADCASTF128 ymm1, m128	VEX	AVX	Broadcasts 128-bit FP block to YMM.
vbroadcasti128	VBROADCASTI128 ymm1, m128	VEX	AVX2	Broadcasts 128-bit integer block to YMM.
vbroadcastsd	VBROADCASTSD ymm1, m64	VEX	AVX	Broadcasts a double to all elements of YMM.
vbroadcastss	VBROADCASTSS ymm1, m32	VEX	AVX	Loads one float and replicates it to all YMM elements.
vcmppd	VCMPPD ymm1, ymm2, ymm3/m256, imm8	VEX	AVX	Compares packed doubles (AVX version with immediate).
vcmpps	VCMPPS ymm1, ymm2, ymm3/m256, imm8	VEX	AVX	Compares packed floats (AVX version with immediate).
vcompresspd	VCOMPRESSPD m512 {k1}, zmm1	EVEX	AVX-512F	Compresses active elements from ZMM to memory.
vcompressps	VCOMPRESSPS m512 {k1}, zmm1	EVEX	AVX-512F	Compresses active elements from ZMM to memory.
vcvtdq2ps	VCVTDQ2PS ymm1, ymm2/m256	VEX	AVX	Converts eight 32-bit integers to floats.
vcvtne2ps2bf16	VCVTNE2PS2BF16 zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-BF16	Converts two float vectors to one BFloat16 vector.
vcvtpd2udq	VCVTPD2UDQ ymm1 {k1}, zmm2/m512	EVEX	AVX-512F	Converts 64-bit doubles to unsigned 32-bit integers.
vcvtpd2uqq	VCVTPD2UQQ zmm1 {k1}, zmm2/m512	EVEX	AVX-512DQ	Converts 64-bit doubles to unsigned 64-bit integers.
vcvtph2ps	VCVTPH2PS xmm1, xmm2/m64	VEX	F16C	Converts half-precision floats to single-precision.
vcvtps2dq	VCVTPS2DQ ymm1, ymm2/m256	VEX	AVX	Converts eight floats to 32-bit integers (Rounded).
vcvtps2ph	VCVTPS2PH xmm1/m64, xmm2, imm8	VEX	F16C	Converts single-precision floats to half-precision.
vcvtps2udq	VCVTPS2UDQ zmm1 {k1}, zmm2/m512	EVEX	AVX-512F	Converts 32-bit floats to unsigned 32-bit integers.
vcvtps2uqq	VCVTPS2UQQ zmm1 {k1}, ymm2/m256	EVEX	AVX-512DQ	Converts 32-bit floats to unsigned 64-bit integers.
vcvttps2dq	VCVTTPS2DQ ymm1, ymm2/m256	VEX	AVX	Converts eight floats to 32-bit integers (Truncated).
vcvtudq2pd	VCVTUDQ2PD zmm1 {k1}, ymm2/m256	EVEX	AVX-512F	Converts unsigned 32-bit integers to 64-bit doubles.
vcvtudq2ps	VCVTUDQ2PS zmm1 {k1}, zmm2/m512	EVEX	AVX-512F	Converts unsigned int32 to float.
vcvtuqq2pd	VCVTUQQ2PD zmm1 {k1}, zmm2/m512	EVEX	AVX-512DQ	Converts unsigned 64-bit integers to 64-bit doubles.
vcvtuqq2ps	VCVTUQQ2PS ymm1 {k1}, zmm2/m512	EVEX	AVX-512DQ	Converts unsigned 64-bit integers to 32-bit floats.
vdbpsadbw	VDBPSADBW zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512BW	Computes double-block sums of absolute differences of unsigned bytes, producing 16-bit results.
vdivph	VDIVPH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Divides half-precision floating-point values.
vdivsh	VDIVSH xmm1 {k1}, xmm2, xmm3/m16	EVEX	AVX-512-FP16	Divides low FP16 value.
vdpbf16ps	VDPBF16PS zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-BF16	BFloat16 dot product accumulating to Float32.
verr	VERR r/m16	System	System	Checks if segment can be read; sets ZF.
verw	VERW r/m16	System	System	Checks if segment can be written; sets ZF.
vexpandpd	VEXPANDPD zmm1 {k1}, m512	EVEX	AVX-512F	Expands data from memory into sparse locations in ZMM.
vexpandps	VEXPANDPS zmm1 {k1}, m512	EVEX	AVX-512F	Expands data from memory into sparse locations in ZMM.
vextractf128	VEXTRACTF128 xmm1/m128, ymm2, imm8	VEX	AVX	Extracts 128-bits from YMM register.
vextracti128	VEXTRACTI128 xmm1/m128, ymm2, imm8	VEX	AVX2	Extracts 128-bits of integer data from YMM.
vfcmaddcph	VFCMADDCPH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Complex conjugate multiply-add for half-precision.
vfixupimmpd	VFIXUPIMMPD zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Fixes special cases (NaN, Inf) using a table.
vfixupimmps	VFIXUPIMMPS zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Fixes special cases (NaN, Inf) using a table (Float32).
vfixupimmss	VFIXUPIMMSS xmm1 {k1}, xmm2, xmm3/m32, imm8	EVEX	AVX-512F	Fixes special cases (NaN, Inf) in low float using table.
vfmadd132ph	VFMADD132PH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Computes (Dest * Src2) + Src1 in half-precision.
vfmadd132ps	VFMADD132PS ymm1, ymm2, ymm3/m256	VEX	FMA3	Computes (Dest * Src2) + Src1.
vfmadd132sh	VFMADD132SH xmm1 {k1}, xmm2, xmm3/m16	EVEX	AVX-512-FP16	Scalar FMA (Dest * Src2 + Src1) for FP16.
vfmadd132ss	VFMADD132SS xmm1, xmm2, xmm3/m32	VEX	FMA3	Scalar FMA: Dest = (Dest * Src2) + Src1.
vfmadd213ph	VFMADD213PH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Computes (Src1 * Dest) + Src2 in half-precision.
vfmadd213ps	VFMADD213PS ymm1, ymm2, ymm3/m256	VEX	FMA3	Computes (Src1 * Dest) + Src2.
vfmadd213ss	VFMADD213SS xmm1, xmm2, xmm3/m32	VEX	FMA3	Scalar FMA: Dest = (Src1 * Dest) + Src2.
vfmadd231ph	VFMADD231PH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Computes (Src1 * Src2) + Dest in half-precision.
vfmadd231ps	VFMADD231PS ymm1, ymm2, ymm3/m256	VEX	FMA3	Computes (Src1 * Src2) + Dest.
vfmadd231ss	VFMADD231SS xmm1, xmm2, xmm3/m32	VEX	FMA3	Scalar FMA: Dest = (Src1 * Src2) + Dest.
vfmaddcph	VFMADDCPH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Complex multiply-add for half-precision.
vfmaddcsh	VFMADDCSH xmm1 {k1}, xmm2, xmm3/m32	EVEX	AVX-512-FP16	Complex multiply-add for scalar half-precision.
vfmsub132ps	VFMSUB132PS ymm1, ymm2, ymm3/m256	VEX	FMA3	Computes (Dest * Src2) - Src1.
vfnmadd132ps	VFNMADD132PS ymm1, ymm2, ymm3/m256	VEX	FMA3	Computes -(Dest * Src2) + Src1.
vfpclasspd	VFPCLASSPD k1 {k2}, zmm2/m512, imm8	EVEX	AVX-512DQ	Tests for category (NaN, Inf, Denormal) for doubles.
vfpclassps	VFPCLASSPS k1 {k2}, zmm2/m512, imm8	EVEX	AVX-512DQ	Tests for category (NaN, Inf, Denormal) for floats.
vgatherdpd	VGATHERDPD ymm1, [base+xmm_idx*scale], ymm_mask	VEX	AVX2	Loads doubles from non-contiguous memory using indices.
vgatherdps	VGATHERDPS ymm1, [base+ymm_idx*scale], ymm_mask	VEX	AVX2	Loads floats from non-contiguous memory using indices.
vgatherpf0dpd	VGATHERPF0DPD {k1}, [base+ymm_idx]	EVEX	AVX-512PF	Prefetches doubles to L1 cache using indices.
vgatherpf0dps	VGATHERPF0DPS {k1}, [base+zmm_idx]	EVEX	AVX-512PF	Prefetches floats to L1 cache using indices.
vgatherpf0qpd	VGATHERPF0QPD {k1}, [base+zmm_idx]	EVEX	AVX-512PF	Prefetches doubles to L1 using 64-bit indices.
vgatherpf0qps	VGATHERPF0QPS {k1}, [base+zmm_idx]	EVEX	AVX-512PF	Prefetches floats to L1 using 64-bit indices.
vgetexppd	VGETEXPPD zmm1 {k1}, zmm2/m512	EVEX	AVX-512F	Extracts exponents from doubles as double-precision floating-point values.
vgetexpss	VGETEXPSS xmm1 {k1}, xmm2, xmm3/m32	EVEX	AVX-512F	Extracts exponent from low float.
vgetmantpd	VGETMANTPD zmm1 {k1}, zmm2/m512, imm8	EVEX	AVX-512F	Extracts mantissas from doubles.
vgetmantsd	VGETMANTSD xmm1 {k1}, xmm2, xmm3/m64, imm8	EVEX	AVX-512F	Extracts mantissa from low double.
vinsertf128	VINSERTF128 ymm1, ymm2, xmm3/m128, imm8	VEX	AVX	Inserts 128-bits into a YMM register.
vinserti128	VINSERTI128 ymm1, ymm2, xmm3/m128, imm8	VEX	AVX2	Inserts 128-bits of integer data into a YMM register.
vmaskmovpd	VMASKMOVPD ymm1, ymm2, m256	VEX	AVX	Conditionally loads/stores doubles based on mask.
vmaskmovps	VMASKMOVPS ymm1, ymm2, m256	VEX	AVX	Conditionally loads/stores floats based on mask.
vmaxph	VMAXPH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Maximum of half-precision values.
vmcall	VMCALL	VMX	VMX	Guest VM calls the Hypervisor (VM Exit).
vmclear	VMCLEAR m64	VMX	VMX	Initializes a VMCS region in memory.
vmfunc	VMFUNC	VMX	VMX	Invokes the VM function specified in EAX.
vminph	VMINPH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Minimum of half-precision values.
vmlaunch	VMLAUNCH	VMX	VMX	Launches a VM managed by the current VMCS.
vmload	VMLOAD	SVM	SVM	Loads processor state from VMCB (AMD SVM).
vmptrld	VMPTRLD m64	VMX	VMX	Loads the current VMCS pointer from memory.
vmptrst	VMPTRST m64	VMX	VMX	Stores the current VMCS pointer to memory.
vmread	VMREAD r/m64, r64	VMX	VMX	Reads a field from the Virtual Machine Control Structure.
vmresume	VMRESUME	VMX	VMX	Resumes a VM from the current VMCS.
vmrun	VMRUN	SVM	SVM	Switches to guest VM (AMD SVM).
vmsave	VMSAVE	SVM	SVM	Saves processor state to VMCB (AMD SVM).
vmulph	VMULPH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Multiplies half-precision floating-point values.
vmulps	VMULPS ymm1, ymm2, ymm3/m256	VEX	AVX	Multiplies packed floats (256-bit).
vmulsh	VMULSH xmm1 {k1}, xmm2, xmm3/m16	EVEX	AVX-512-FP16	Multiplies low FP16 value.
vmulss	VMULSS xmm1 {k1}, xmm2, xmm3/m32	EVEX	AVX-512F	Multiplies scalar single precision (EVEX encoded with masking).
vmwrite	VMWRITE r64, r/m64	VMX	VMX	Writes a field to the Virtual Machine Control Structure.
vmxoff	VMXOFF	VMX	VMX	Leaves VMX root operation.
vmxon	VMXON m64	VMX	VMX	Enters VMX root operation (Host Mode).
vp2intersectd	VP2INTERSECTD k1+1, zmm2, zmm3/m512	EVEX	AVX-512-VP2INTERSECT	Computes intersection of two ZMM registers into mask pair.
vp2intersectq	VP2INTERSECTQ k1+1, zmm2, zmm3/m512	EVEX	AVX-512-VP2INTERSECT	Computes intersection of two ZMM registers into mask pair.
vp4dpwssd	VP4DPWSSD zmm1 {k1}, zmm2+3, m128	EVEX	AVX-512-4VNNIW	Neural Net 4-way dot product.
vp4dpwssds	VP4DPWSSDS zmm1 {k1}, zmm2+3, m128	EVEX	AVX-512-4VNNIW	Neural Net 4-way dot product with saturation.
vpabsd	VPABSD zmm1 {k1}, zmm2/m512	EVEX	AVX-512F	Computes absolute value of 32-bit integers.
vpabsq	VPABSQ zmm1 {k1}, zmm2/m512	EVEX	AVX-512F	Computes absolute value of 64-bit integers.
vpaddb	VPADDB ymm1, ymm2, ymm3/m256	VEX	AVX2	Adds 32 bytes (256-bit).
vpaddd	VPADDD ymm1, ymm2, ymm3/m256	VEX	AVX2	Adds 8 integers (256-bit).
vpbroadcastb	VPBROADCASTB ymm1, xmm2/m8	VEX	AVX2	Broadcasts a byte from memory/register to all elements of YMM.
vpbroadcastd	VPBROADCASTD ymm1, xmm2/m32	VEX	AVX2	Loads one integer and replicates it to all YMM elements.
vpbroadcastq	VPBROADCASTQ ymm1, xmm2/m64	VEX	AVX2	Broadcasts a quadword to all elements of YMM.
vpbroadcastw	VPBROADCASTW ymm1, xmm2/m16	VEX	AVX2	Broadcasts a word to all elements of YMM.
vpclmulqdq	VPCLMULQDQ zmm1, zmm2, zmm3/m512, imm8	EVEX	AVX-512-VPCLMULQDQ	Carry-less multiply on 512-bit vector.
vpcmov	VPCMOV xmm1, xmm2, xmm3, xmm4	XOP	XOP	Bitwise conditional move based on selector.
vpcmpb	VPCMPB k1 {k2}, zmm2, zmm3/m512, imm8	EVEX	AVX-512BW	Compares bytes and stores result in k-register mask.
vpcmpd	VPCMPD k1 {k2}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Compares doublewords and stores result in k-register mask.
vpcmpq	VPCMPQ k1 {k2}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Compares quadwords and stores result in k-register mask.
vpcmpub	VPCMPUB k1 {k2}, zmm2, zmm3/m512, imm8	EVEX	AVX-512BW	Compares unsigned bytes and stores result in k-register.
vpcmpud	VPCMPUD k1 {k2}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Compares unsigned doublewords and stores result in k-register.
vpcmpuq	VPCMPUQ k1 {k2}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Compares unsigned quadwords and stores result in k-register.
vpcmpuw	VPCMPUW k1 {k2}, zmm2, zmm3/m512, imm8	EVEX	AVX-512BW	Compares unsigned words and stores result in k-register.
vpcmpw	VPCMPW k1 {k2}, zmm2, zmm3/m512, imm8	EVEX	AVX-512BW	Compares words and stores result in k-register mask.
vpcomb	VPCOMB xmm1, xmm2, xmm3/m128, imm8	XOP	XOP	Compares bytes using immediate condition.
vpcompressb	VPCOMPRESSB m512 {k1}, zmm1	EVEX	AVX-512-VBMI2	Compresses active bytes from ZMM to memory.
vpcompressw	VPCOMPRESSW m512 {k1}, zmm1	EVEX	AVX-512-VBMI2	Compresses active words from ZMM to memory.
vpconflictd	VPCONFLICTD zmm1 {k1}, zmm2/m512	EVEX	AVX-512CD	Detects duplicate values in a vector (Conflict Detection).
vpconflictq	VPCONFLICTQ zmm1 {k1}, zmm2/m512	EVEX	AVX-512CD	Detects duplicate values in a quadword vector.
vpdpbusd	VPDPBUSD zmm1, zmm2, zmm3/m512	EVEX	AVX-512-VNNI	Dot product of unsigned/signed bytes, accum to dword.
vpdpbusds	VPDPBUSDS zmm1, zmm2, zmm3/m512	EVEX	AVX-512-VNNI	Dot product of unsigned/signed bytes, accum to dword (Saturate).
vpdpwssd	VPDPWSSD zmm1, zmm2, zmm3/m512	EVEX	AVX-512-VNNI	Dot product of signed words, accum to dword.
vpdpwssds	VPDPWSSDS zmm1, zmm2, zmm3/m512	EVEX	AVX-512-VNNI	Dot product of signed words, accum to dword (Saturate).
vperm2f128	VPERM2F128 ymm1, ymm2, ymm3/m256, imm8	VEX	AVX	Shuffles 128-bit float lanes between YMM registers.
vperm2i128	VPERM2I128 ymm1, ymm2, ymm3/m256, imm8	VEX	AVX2	Shuffles two 128-bit lanes between registers.
vpermb	VPERMB zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-VBMI	Permutes bytes in ZMM based on index vector.
vpermd	VPERMD ymm1, ymm2, ymm3/m256	VEX	AVX2	Full permutation of 8 integers using indices from a register.
vpermi2b	VPERMI2B zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-VBMI	Shuffles bytes from two ZMM tables, overwriting the index register.
vpermi2d	VPERMI2D zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Shuffles doublewords from two ZMM tables, overwriting the index register.
vpermi2q	VPERMI2Q zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Shuffles quadwords from two ZMM tables, overwriting the index register.
vpermilpd	VPERMILPD ymm1, ymm2/m256, imm8	AVX	AVX	Shuffles doubles within 128-bit lanes (AVX).
vpermilps	VPERMILPS ymm1, ymm2/m256, imm8	AVX	AVX	Shuffles floats within 128-bit lanes (AVX).
vpermps	VPERMPS ymm1, ymm2, ymm3/m256	VEX	AVX2	Full permutation of 8 floats using indices.
vpermq	VPERMQ ymm1, ymm2/m256, imm8	VEX	AVX2	Permutes quadwords across the full 256-bit register using immediate.
vpermt2b	VPERMT2B zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-VBMI	Shuffles bytes from two tables, overwriting the first table register.
vpermt2d	VPERMT2D zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Shuffles doublewords from two tables, overwriting the first table register.
vpermt2q	VPERMT2Q zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Shuffles quadwords from two tables, overwriting the first table register.
vpermw	VPERMW zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512BW	Full permutation of 32 words using indices.
vpexpandb	VPEXPANDB zmm1 {k1}, m512	EVEX	AVX-512-VBMI2	Expands bytes from memory into sparse locations in ZMM.
vpexpandw	VPEXPANDW zmm1 {k1}, m512	EVEX	AVX-512-VBMI2	Expands words from memory into sparse locations in ZMM.
vpgatherdd	VPGATHERDD ymm1, [base+ymm_idx*scale], ymm_mask	VEX	AVX2	Gathers 32-bit integers using 32-bit indices.
vpgatherdq	VPGATHERDQ ymm1, [base+xmm_idx*scale], ymm_mask	VEX	AVX2	Gathers 64-bit integers using 32-bit indices.
vpgatherqd	VPGATHERQD xmm1, [base+ymm_idx*scale], xmm_mask	VEX	AVX2	Gathers 32-bit integers using 64-bit indices.
vpgatherqq	VPGATHERQQ ymm1, [base+ymm_idx*scale], ymm_mask	VEX	AVX2	Gathers 64-bit integers using 64-bit indices.
vphaddbd	VPHADDBD xmm1, xmm2/m128	XOP	XOP	Adds adjacent bytes to doublewords.
vphaddbq	VPHADDBQ xmm1, xmm2/m128	XOP	XOP	Adds adjacent bytes to quadwords.
vphaddbw	VPHADDBW xmm1, xmm2/m128	XOP	XOP	Adds adjacent bytes to words.
vphadddq	VPHADDDQ xmm1, xmm2/m128	XOP	XOP	Adds adjacent doublewords to quadwords.
vphaddwd	VPHADDWD xmm1, xmm2/m128	XOP	XOP	Adds adjacent words to doublewords.
vphaddwq	VPHADDWQ xmm1, xmm2/m128	XOP	XOP	Adds adjacent words to quadwords.
vplzcntd	VPLZCNTD zmm1 {k1}, zmm2/m512	EVEX	AVX-512CD	Counts leading zeros for each doubleword element.
vplzcntq	VPLZCNTQ zmm1 {k1}, zmm2/m512	EVEX	AVX-512CD	Counts leading zeros for each quadword element.
vpmacssww	VPMACSSWW xmm1, xmm2, xmm3, xmm4	XOP	XOP	Multiply-accumulate signed words with saturation.
vpmacsww	VPMACSWW xmm1, xmm2, xmm3, xmm4	XOP	XOP	Multiply-accumulate signed words.
vpmadd52huq	VPMADD52HUQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-IFMA	Fused multiply-add for 52-bit integers (High 52 bits).
vpmadd52luq	VPMADD52LUQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-IFMA	Fused multiply-add for 52-bit integers (Low 52 bits).
vpmaxsq	VPMAXSQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Returns maximum of signed 64-bit integers.
vpmaxuq	VPMAXUQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Returns maximum of unsigned 64-bit integers.
vpminsq	VPMINSQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Returns minimum of signed 64-bit integers.
vpminuq	VPMINUQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Returns minimum of unsigned 64-bit integers.
vpmovb2m	VPMOVB2M k1, zmm1	EVEX	AVX-512BW	Moves byte integer mask from ZMM to k-register.
vpmovd2m	VPMOVD2M k1, zmm1	EVEX	AVX-512DQ	Moves doubleword integer mask from ZMM to k-register.
vpmovdb	VPMOVDB xmm1/m128 {k1}, zmm2	EVEX	AVX-512F	Down-converts 32-bit integers to 8-bit.
vpmovm2b	VPMOVM2B zmm1, k1	EVEX	AVX-512BW	Expands k-register bits to byte elements in ZMM.
vpmovm2d	VPMOVM2D zmm1, k1	EVEX	AVX-512DQ	Expands k-register bits to doubleword elements in ZMM.
vpmovm2q	VPMOVM2Q zmm1, k1	EVEX	AVX-512DQ	Expands k-register bits to quadword elements in ZMM.
vpmovm2w	VPMOVM2W zmm1, k1	EVEX	AVX-512BW	Expands k-register bits to word elements in ZMM.
vpmovq2m	VPMOVQ2M k1, zmm1	EVEX	AVX-512DQ	Moves quadword integer mask from ZMM to k-register.
vpmovsqb	VPMOVSQB xmm1/m64 {k1}, zmm2	EVEX	AVX-512F	Down-converts 64-bit integers to 8-bit with signed saturation.
vpmovswb	VPMOVSWB ymm1/m256 {k1}, zmm2	EVEX	AVX-512BW	Down-converts 16-bit integers to 8-bit with signed saturation.
vpmovusdb	VPMOVUSDB xmm1/m128 {k1}, zmm2	EVEX	AVX-512F	Down-converts 32-bit to 8-bit with unsigned saturation.
vpmovusqb	VPMOVUSQB xmm1/m64 {k1}, zmm2	EVEX	AVX-512F	Down-converts 64-bit integers to 8-bit with unsigned saturation.
vpmovuswb	VPMOVUSWB ymm1/m256 {k1}, zmm2	EVEX	AVX-512BW	Down-converts 16-bit integers to 8-bit with unsigned saturation.
vpmovw2m	VPMOVW2M k1, zmm1	EVEX	AVX-512BW	Moves word integer mask from ZMM to k-register.
vpmulld	VPMULLD ymm1, ymm2, ymm3/m256	VEX	AVX2	Multiplies 8 integers (256-bit).
vpmullq	VPMULLQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512DQ	Multiplies 64-bit integers and keeps low 64-bit result.
vpmultishiftqb	VPMULTISHIFTQB zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-VBMI	Selects bytes from each 64-bit element based on shift control.
vpopcntb	VPOPCNTB zmm1 {k1}, zmm2/m512	EVEX	AVX-512-BITALG	Counts set bits in each byte.
vpopcntd	VPOPCNTD zmm1 {k1}, zmm2/m512	EVEX	AVX-512-VPOPCNTDQ	Counts set bits in each doubleword element.
vpopcntq	VPOPCNTQ zmm1 {k1}, zmm2/m512	EVEX	AVX-512-VPOPCNTDQ	Counts set bits in each quadword element.
vpopcntw	VPOPCNTW zmm1 {k1}, zmm2/m512	EVEX	AVX-512-BITALG	Counts set bits in each word element.
vprolq	VPROLQ zmm1 {k1}, zmm2, imm8	EVEX	AVX-512F	Rotates 64-bit integers left.
vprolvd	VPROLVD zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Rotates doublewords left by amounts in second vector.
vprolvq	VPROLVQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Rotates quadwords left by amounts in second vector.
vprorq	VPRORQ zmm1 {k1}, zmm2, imm8	EVEX	AVX-512F	Rotates 64-bit integers right.
vprorvd	VPRORVD zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Rotates doublewords right by amounts in second vector.
vprorvq	VPRORVQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Rotates quadwords right by amounts in second vector.
vprotb	VPROTB xmm1, xmm2/m128, imm8	XOP	XOP	Rotates bytes in XMM register.
vprotd	VPROTD xmm1, xmm2/m128, imm8	XOP	XOP	Rotates doublewords in XMM register.
vprotq	VPROTQ xmm1, xmm2/m128, imm8	XOP	XOP	Rotates quadwords in XMM register.
vprotw	VPROTW xmm1, xmm2/m128, imm8	XOP	XOP	Rotates words in XMM register.
vpshab	VPSHAB xmm1, xmm2/m128, xmm3	XOP	XOP	Shifts bytes arithmetically.
vpshad	VPSHAD xmm1, xmm2/m128, xmm3	XOP	XOP	Shifts doublewords arithmetically.
vpshaq	VPSHAQ xmm1, xmm2/m128, xmm3	XOP	XOP	Shifts quadwords arithmetically.
vpshaw	VPSHAW xmm1, xmm2/m128, xmm3	XOP	XOP	Shifts words arithmetically.
vpshldd	VPSHLDD zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512-VBMI2	Funnel shift left of doublewords.
vpshldq	VPSHLDQ zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512-VBMI2	Funnel shift left of quadwords.
vpshldw	VPSHLDW zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512-VBMI2	Funnel shift left of words.
vpshrdd	VPSHRDD zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512-VBMI2	Funnel shift right of doublewords.
vpshrdq	VPSHRDQ zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512-VBMI2	Funnel shift right of quadwords.
vpshrdw	VPSHRDW zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512-VBMI2	Funnel shift right of words.
vpshufb	VPSHUFB ymm1, ymm2, ymm3/m256	VEX	AVX2	Shuffles 32 bytes based on indices (within each 128-bit lane).
vpshufbitqmb	VPSHUFBITQMB k1 {k2}, zmm2, zmm3/m512	EVEX	AVX-512-BITALG	Extracts bits from bytes and packs into a mask register.
vpsllvd	VPSLLVD ymm1, ymm2, ymm3/m256	AVX2	AVX2	Shifts doublewords left by individual counts.
vpsllvq	VPSLLVQ ymm1, ymm2, ymm3/m256	AVX2	AVX2	Shifts quadwords left by individual counts.
vpsravd	VPSRAVD ymm1, ymm2, ymm3/m256	AVX2	AVX2	Shifts doublewords right arithmetic by individual counts.
vpsravq	VPSRAVQ zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Shifts quadwords right arithmetic by individual counts.
vpsrlvd	VPSRLVD ymm1, ymm2, ymm3/m256	AVX2	AVX2	Shifts doublewords right logical by individual counts.
vpsrlvq	VPSRLVQ ymm1, ymm2, ymm3/m256	AVX2	AVX2	Shifts quadwords right logical by individual counts.
vpsubd	VPSUBD ymm1, ymm2, ymm3/m256	VEX	AVX2	Subtracts 8 integers (256-bit).
vpternlogd	VPTERNLOGD zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Performs one of 256 logical operations on 3 inputs.
vpternlogq	VPTERNLOGQ zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Performs one of 256 logical operations on 3 quadwords.
vptestmb	VPTESTMB k1 {k2}, zmm2, zmm3/m512	EVEX	AVX-512BW	Tests byte integers and sets k-register mask.
vptestmd	VPTESTMD k1 {k2}, zmm2, zmm3/m512	EVEX	AVX-512F	Tests doubleword integers and sets k-register mask.
vptestmq	VPTESTMQ k1 {k2}, zmm2, zmm3/m512	EVEX	AVX-512F	Tests quadword integers and sets k-register mask.
vptestmw	VPTESTMW k1 {k2}, zmm2, zmm3/m512	EVEX	AVX-512BW	Tests word integers and sets k-register mask.
vrangeps	VRANGEPS zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512DQ	Calculates range (min/max/abs) of float values.
vrangess	VRANGESS xmm1 {k1}, xmm2, xmm3/m32, imm8	EVEX	AVX-512DQ	Calculates range (min/max/abs) of low float.
vrcp14ps	VRCP14PS zmm1 {k1}, zmm2/m512	EVEX	AVX-512F	Approximate 1/x with 2^-14 error.
vreduceps	VREDUCEPS zmm1 {k1}, zmm2/m512, imm8	EVEX	AVX-512DQ	Performs reduction on floats (e.g. range reduction for trig).
vreducess	VREDUCESS xmm1 {k1}, xmm2, xmm3/m32, imm8	EVEX	AVX-512DQ	Performs reduction on low float.
vrndscalepd	VRNDSCALEPD zmm1 {k1}, zmm2/m512, imm8	EVEX	AVX-512F	Rounds doubles to integer values using imm8 control.
vrsqrt14ps	VRSQRT14PS zmm1 {k1}, zmm2/m512	EVEX	AVX-512F	Approximate 1/sqrt(x) with 2^-14 error.
vscalefpd	VSCALEFPD zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512F	Scales doubles by exponents (x * 2^n).
vscatterdpd	VSCATTERDPD [base+ymm_idx*scale] {k1}, zmm1	EVEX	AVX-512F	Stores doubles to non-contiguous memory locations using 32-bit indices.
vscatterdps	VSCATTERDPS [base+zmm_idx*scale] {k1}, zmm1	EVEX	AVX-512F	Stores floats to non-contiguous memory locations.
vscatterpf0dpd	VSCATTERPF0DPD [base+ymm_idx*scale] {k1}	EVEX	AVX-512PF	Prefetches lines for scatter write (L1, Double).
vscatterpf0dps	VSCATTERPF0DPS [base+zmm_idx*scale] {k1}	EVEX	AVX-512PF	Prefetches cache lines for scatter write (L1).
vscatterpf0qpd	VSCATTERPF0QPD [base+zmm_idx*scale] {k1}	EVEX	AVX-512PF	Prefetches lines for scatter write (L1, Double, 64-bit idx).
vscatterpf0qps	VSCATTERPF0QPS [base+zmm_idx*scale] {k1}	EVEX	AVX-512PF	Prefetches lines for scatter write (L1, 64-bit idx).
vscatterqpd	VSCATTERQPD [base+zmm_idx*scale] {k1}, zmm1	EVEX	AVX-512F	Stores doubles using 64-bit indices.
vscatterqps	VSCATTERQPS [base+zmm_idx*scale] {k1}, ymm1	EVEX	AVX-512F	Stores floats using 64-bit indices.
vsha512msg1	VSHA512MSG1 ymm1, xmm2	VEX	SHA512	SHA512 message schedule intermediate calculation.
vsha512msg2	VSHA512MSG2 ymm1, ymm2	VEX	SHA512	SHA512 message schedule final calculation.
vsha512rnds2	VSHA512RNDS2 ymm1, ymm2, xmm3	VEX	SHA512	Performs two rounds of SHA512.
vshuff32x4	VSHUFF32X4 zmm1 {k1}, zmm2, zmm3/m512, imm8	EVEX	AVX-512F	Shuffles 128-bit blocks of single-precision floats.
vsm3msg1	VSM3MSG1 xmm1, xmm2, xmm3	VEX	SM3	SM3 crypto message schedule part 1.
vsm3rnds2	VSM3RNDS2 xmm1, xmm2, xmm3/m128, imm8	VEX	SM3	Performs two rounds of SM3.
vsm4key4	VSM4KEY4 xmm1, xmm2, xmm3/m128	VEX	SM4	Performs four rounds of SM4 key expansion.
vsm4rnds4	VSM4RNDS4 xmm1, xmm2, xmm3/m128	VEX	SM4	Performs four rounds of SM4 encryption.
vsqrtph	VSQRTPH zmm1 {k1}, zmm2/m512	EVEX	AVX-512-FP16	Square root of half-precision values.
vsqrtsh	VSQRTSH xmm1 {k1}, xmm2, xmm3/m16	EVEX	AVX-512-FP16	Square root of low FP16 value.
vsubph	VSUBPH zmm1 {k1}, zmm2, zmm3/m512	EVEX	AVX-512-FP16	Subtracts half-precision floating-point values.
vsubsh	VSUBSH xmm1 {k1}, xmm2, xmm3/m16	EVEX	AVX-512-FP16	Subtracts low FP16 value.
vtestpd	VTESTPD xmm1, xmm2/m128	AVX	AVX	Sets ZF/CF based on sign bit comparisons of doubles.
vtestps	VTESTPS xmm1, xmm2/m128	AVX	AVX	Sets ZF/CF based on sign bit comparisons of floats.
vzeroall	VZEROALL	VEX	AVX	Clears all YMM registers.
vzeroupper	VZEROUPPER	VEX	AVX	Clears bits 128-255 of all YMM registers (Avoids AVX-SSE transition penalty).
wait	WAIT	Legacy	Base	Wait for FPU (same as FWAIT).
wbinvd	WBINVD	System	System	Writes back modified data and invalidates caches (Privileged).
wbnoinvd	WBNOINVD	Legacy	WBNOINVD	Writes back modified lines but keeps them valid in cache.
wrfsbase	WRFSBASE r64	Legacy	FSGSBASE	Writes a register to the FS base address.
wrgsbase	WRGSBASE r64	Legacy	FSGSBASE	Writes a register to the GS base address.
wrmsr	WRMSR	System	System	Writes EDX:EAX to MSR specified by ECX (Privileged).
wrpkru	WRPKRU	Legacy	PKU	Writes EAX to the PKRU register (ECX and EDX must be zero).
xabort	XABORT imm8	Legacy	RTM (TSX)	Forces an RTM abort.
xadd	XADD r/m, r	Legacy	Base	Exchanges dest and src, then loads sum into dest.
xbegin	XBEGIN rel	Legacy	RTM (TSX)	Specifies start of Restricted Transactional Memory region.
xchg	XCHG r/m, r	Legacy	Base	Exchanges content of two operands.
xend	XEND	Legacy	RTM (TSX)	Specifies end of RTM region.
xgetbv	XGETBV	Legacy	XSAVE	Reads the extended control register selected by ECX (e.g. XCR0, the feature mask) into EDX:EAX.
xlat	XLAT m8	Legacy	Base	Replaces AL with byte from table at [EBX+AL].
xor	XOR r/m, r	Legacy	Base	Performs bitwise XOR.
xorps	XORPS xmm, xmm/m128	SSE	SSE	Bitwise XOR of 128 bits (Used to clear registers).
xrstor	XRSTOR m	Legacy	XSAVE	Restores specified state components from memory.
xrstors	XRSTORS m	Legacy	XSAVES	Restores supervisor state components from memory (Compact).
xsave	XSAVE m	Legacy	XSAVE	Saves specified state components (AVX, SSE, etc.) to memory.
xsavec	XSAVEC m	Legacy	XSAVEC	Saves state components using compaction.
xsaveopt	XSAVEOPT m	Legacy	XSAVEOPT	Saves state components (optimized for Modified state).
xsaves	XSAVES m	Legacy	XSAVES	Saves supervisor state components to memory (Compact).
xsetbv	XSETBV	Legacy	XSAVE	Writes EDX:EAX to the extended control register selected by ECX (e.g. XCR0; enables/disables AVX/SSE states; Privileged).
xsusldtrk	XSUSLDTRK	Legacy	TSXLDTRK	Suspends tracking of load addresses in a TSX transaction.
xtest	XTEST	Legacy	TSX	Clears ZF if executing inside a transactional region; sets ZF otherwise.