pmxvbf16ger2
Prefixed Masked VSX Vector BFloat16 Ger (Rank-2 Update)
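Matrix-Multiply Assist (MMA) instruction. Computes ACC <- A * B from bfloat16 (BF16) inputs under row, column, and product masks. The result is single-precision floating-point and overwrites the accumulator; the accumulating forms of this instruction carry a pp, pn, np, or nn suffix.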
Details
The pmxvbf16ger2 instruction performs a masked bfloat16 outer-product (GER) rank-2 update. XA is treated as a 4x2 matrix of bfloat16 values and XB as a 2x4 matrix, with each word element holding one pair of bfloat16 values. For each result element (i, j), the instruction multiplies the bfloat16 pair in word i of XA by the pair in word j of XB, adds the products enabled by PMSK, and writes the sum as a single-precision floating-point value to element (i, j) of the 4x4 accumulator AT. XMSK enables rows of the result and YMSK enables columns; elements in a disabled row or column are set to zero.
Pseudocode Operation
Matrix Multiply Accumulate (BF16)
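The operation can be sketched as a scalar reference model in portable C. This is an illustrative sketch, not the architected pseudocode: the function name pmxvbf16ger2_model, the helper names, and the array layout (a[i][k] for halfword k of word i of XA, b[j][k] for halfword k of word j of XB) are assumptions made for the example. It also assumes that mask bits are numbered MSB-first, so bit 0 of the 4-bit XMSK is the 0x8 bit, and it does not try to reproduce the hardware's exact rounding or exception behavior.

```c
#include <stdint.h>
#include <string.h>

/* Widen a bfloat16 bit pattern to float. A bfloat16 value is the upper
 * 16 bits of an IEEE-754 binary32 value, so the widening is exact. */
static float bf16_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Assumption: mask bits are numbered MSB-first, so bit 0 of a
 * width-bit mask is its most significant bit. */
static int mask_bit(unsigned mask, int i, int width)
{
    return (int)((mask >> (width - 1 - i)) & 1u);
}

/* Hypothetical reference model of pmxvbf16ger2.
 * a[i][k]: bfloat16 halfword k of word i of XA (row i of A).
 * b[j][k]: bfloat16 halfword k of word j of XB (column j of B).
 * acc:     4x4 single-precision result, overwritten (not accumulated). */
static void pmxvbf16ger2_model(float acc[4][4],
                               const uint16_t a[4][2],
                               const uint16_t b[4][2],
                               unsigned xmsk, unsigned ymsk, unsigned pmsk)
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            /* A disabled row or column yields zero. */
            if (!(mask_bit(xmsk, i, 4) && mask_bit(ymsk, j, 4))) {
                acc[i][j] = 0.0f;
                continue;
            }
            /* Sum only the products enabled by PMSK. The sum is formed
             * in double and rounded once to float, which approximates
             * but may not match the hardware rounding in every case. */
            double sum = 0.0;
            for (int k = 0; k < 2; k++) {
                if (mask_bit(pmsk, k, 2))
                    sum += (double)bf16_to_float(a[i][k]) *
                           (double)bf16_to_float(b[j][k]);
            }
            acc[i][j] = (float)sum;
        }
    }
}
```

Calling the model with xmsk = 0xF, ymsk = 0xF, and pmsk = 0x3 enables every row, column, and product, which corresponds to the unmasked form of the operation.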
Programming Note
The pmxvbf16ger2 instruction is useful for matrix operations on bfloat16 data where only part of a 4x4 tile is needed: the masks select which rows, columns, and products contribute, which helps with edge tiles whose dimensions are not multiples of the tile size. The operands are registers, so alignment matters only for the loads and stores that move data between memory and the vector and accumulator registers. The instruction operates at the user privilege level. Its results are single-precision floating-point values, so no integer saturation is involved.
Example
Encoding
Operands
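- AT: Accumulator (0-7)
- XA: Vector A
- XB: Vector B
- XMSK: Mask for A (rows of the result)
- YMSK: Mask for B (columns of the result)
- PMSK: Mask for the products summed into each result element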