pmxvbf16ger2

Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update)

pmxvbf16ger2 AT, XA, XB, XMSK, YMSK, PMSK

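Matrix Multiply Assist (MMA) instruction. Computes ACC <- A * B (a masked outer product of BF16 inputs, producing single-precision results). This form overwrites the accumulator; the pp, pn, np, and nn variants accumulate into it.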

Details

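The pmxvbf16ger2 instruction performs a masked bfloat16 outer-product (GER) rank-2 update. XA and XB each hold four words, and each word holds two bfloat16 values. For every row i and column j of the 4x4 accumulator AT, the instruction multiplies the two bfloat16 values in word i of XA by the corresponding values in word j of XB, sums the two products, and writes the sum to element (i, j) as a single-precision floating-point value. XMSK enables rows, YMSK enables columns, and PMSK enables each of the two products; elements in a disabled row or column are set to zero. The arithmetic is floating-point throughout, so no integer saturation is involved.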

Pseudocode Operation

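do i = 0 to 3
   do j = 0 to 3
      if XMSK.bit[i] & YMSK.bit[j] then
         sum <- 0
         do k = 0 to 1
            if PMSK.bit[k] then
               sum <- sum + bf16(XA.word[i].hword[k]) * bf16(XB.word[j].hword[k])
         ACC[AT].word[i][j] <- single_precision(sum)
      else
         ACC[AT].word[i][j] <- 0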
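The operation can also be sketched as a small executable model. This is an illustration under stated assumptions, not the architected definition: the function name pmxvbf16ger2_model and the input layout (four [hword0, hword1] pairs of raw 16-bit bfloat16 patterns per vector) are hypothetical, mask bit 0 is taken to be the most-significant bit of each mask field, and the sum is formed in double precision and rounded once to single precision, which may differ from the hardware's rounding and exception behavior.

```python
import math
import struct


def bf16_to_float(bits):
    """Widen a 16-bit bfloat16 pattern to a Python float (exact)."""
    return struct.unpack(">f", struct.pack(">I", (bits & 0xFFFF) << 16))[0]


def to_f32(x):
    """Round a Python float to single precision (overflow becomes infinity)."""
    try:
        return struct.unpack(">f", struct.pack(">f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def pmxvbf16ger2_model(xa, xb, xmsk, ymsk, pmsk):
    """Return the 4x4 result that would be written to the accumulator.

    xa, xb: four [hword0, hword1] pairs of bfloat16 bit patterns.
    xmsk, ymsk: 4-bit row/column masks; pmsk: 2-bit product mask.
    Assumption: bit 0 of each mask is its most-significant bit.
    """
    acc = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            row_on = (xmsk >> (3 - i)) & 1
            col_on = (ymsk >> (3 - j)) & 1
            if not (row_on and col_on):
                continue  # masked-out elements stay zero
            total = 0.0
            for k in range(2):
                if (pmsk >> (1 - k)) & 1:
                    total += bf16_to_float(xa[i][k]) * bf16_to_float(xb[j][k])
            acc[i][j] = to_f32(total)
    return acc
```

With every word of XA holding (1.0, 2.0) and every word of XB holding (3.0, 0.5), all masks enabled gives 1.0 * 3.0 + 2.0 * 0.5 = 4.0 in every element; clearing the second PMSK bit leaves only the first product, 3.0.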

Programming Note

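The pmxvbf16ger2 instruction is useful for bfloat16 matrix kernels whose tiles do not fill a full 4x4 block or a full rank-2 step: the masks disable the unused rows, columns, or products instead of requiring padded inputs. With all mask bits set it behaves like the unprefixed xvbf16ger2. Because this form overwrites AT, use it to start a reduction and use pmxvbf16ger2pp (or the pn, np, nn variants) for the following steps. Each accumulator overlays four VSRs (4*AT through 4*AT+3), so XA and XB must not name a register in that group. The instruction operates at the user privilege level. Results are single-precision floating-point values; no integer saturation applies.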

Example

pmxvbf16ger2 0, 4, 5, 15, 15, 3

Encoding

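Binary Layout

Prefix word
Bits    Field
0-5     1 (primary opcode)
6-7     3
8-11    9
12-13   / (reserved)
14-15   0
16-17   PMSK
18-23   / (reserved)
24-27   XMSK
28-31   YMSK

Suffix word
Bits    Field
32-37   59 (primary opcode)
38-40   AT
41-42   / (reserved)
43-47   XA (low 5 bits)
48-52   XB (low 5 bits)
53-60   XO
61      AX (high bit of XA)
62      BX (high bit of XB)
63      / (reserved)
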
Format MMIRR:XX3-form
Opcode prefix 0x07900000, suffix 0xEC000198 (primary opcode 59, XO 51)
Extension MMA, prefixed
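As a cross-check on the field positions, the following sketch packs the operands into the two instruction words. It is illustrative only: the function name encode_pmxvbf16ger2 is hypothetical, and the shift amounts assume the MMIRR:XX3 layout of the Power ISA 3.1 (prefix opcode 1, type 3, subtype 9; suffix primary opcode 59, XO 51), with bit 0 as the most-significant bit of each 32-bit word. Verify against the ISA or an assembler before relying on it.

```python
def encode_pmxvbf16ger2(at, xa, xb, xmsk, ymsk, pmsk):
    """Return (prefix, suffix) as 32-bit integers for the given operands."""
    fields = (("AT", at, 8), ("XA", xa, 64), ("XB", xb, 64),
              ("XMSK", xmsk, 16), ("YMSK", ymsk, 16), ("PMSK", pmsk, 4))
    for name, value, limit in fields:
        if not 0 <= value < limit:
            raise ValueError(f"{name}={value} out of range")
    # Accumulator AT overlays VSRs 4*AT .. 4*AT+3; sources must not overlap it.
    if xa // 4 == at or xb // 4 == at:
        raise ValueError("XA/XB overlaps the accumulator's VSRs")
    prefix = ((1 << 26) | (3 << 24) | (9 << 20)
              | (pmsk << 14) | (xmsk << 4) | ymsk)
    suffix = ((59 << 26) | (at << 23)
              | ((xa & 31) << 16) | ((xb & 31) << 11)
              | (51 << 3)
              | ((xa >> 5) << 2)    # AX: high bit of the 6-bit XA number
              | ((xb >> 5) << 1))   # BX: high bit of the 6-bit XB number
    return prefix, suffix


prefix, suffix = encode_pmxvbf16ger2(0, 4, 5, 15, 15, 3)
print(f"{prefix:#010x} {suffix:#010x}")  # 0x0790c0ff 0xec042998
```

The range and overlap checks mirror the operand constraints; passing XA = 1 with AT = 0, for instance, raises ValueError because VSR 1 belongs to accumulator 0.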

Operands

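  • AT
    Target accumulator (0-7); overwritten with the 4x4 single-precision result
  • XA
    VSX source register A (0-63); four words of two bfloat16 values each
  • XB
    VSX source register B (0-63); four words of two bfloat16 values each
  • XMSK
    4-bit mask over the rows of the result (words of XA)
  • YMSK
    4-bit mask over the columns of the result (words of XB)
  • PMSK
    2-bit mask over the two products summed for each element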