xvi8ger4pp
VSX Vector Integer 8-bit GER (Rank-4 Update) Plus/Plus
Signed/Unsigned variations of 8-bit matrix multiply accumulate.
Details
The xvi8ger4pp instruction performs a rank-4 update of the accumulator ACC[AT]. For each element [i][j], it sums the four products of the signed 8-bit integers in row i of VSR[XA] with the unsigned 8-bit integers in row j of VSR[XB], then adds the current value of ACC[AT][i][j]. The result is chopped to fit into a 32-bit signed integer: the low-order 32 bits are kept and no saturation is applied.
Pseudocode Operation
For i = 0 to 3 and j = 0 to 3, where X[i][k] is byte k of word i of VSR[XA] and Y[j][k] is byte k of word j of VSR[XB]:
ACC[AT][i][j] ← si32_CHOP( EXTS(X[i][0]) * EXTZ(Y[j][0]) +
                           EXTS(X[i][1]) * EXTZ(Y[j][1]) +
                           EXTS(X[i][2]) * EXTZ(Y[j][2]) +
                           EXTS(X[i][3]) * EXTZ(Y[j][3]) +
                           EXTS(ACC[AT][i][j]) )
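The operation can be sketched as a scalar reference model in C. This is a minimal illustration, not an official API: the names `chop32` and `xvi8ger4pp_model` are hypothetical, and the sketch assumes each 128-bit source register is viewed as four rows of four bytes, with the XA bytes treated as signed and the XB bytes as unsigned. The mapping of rows to byte positions is also an assumption and ignores endianness.

```c
#include <stdint.h>

/* Keep the low-order 32 bits of v and reinterpret them as a signed word. */
static int32_t chop32(int64_t v)
{
    uint32_t low = (uint32_t)v;              /* modular, well defined */
    return low <= INT32_MAX ? (int32_t)low
                            : (int32_t)((int64_t)low - 4294967296LL);
}

/* Hypothetical scalar model of one update (illustrative name, not an API).
   xa:  16 bytes standing in for VSR[XA]; row i = bytes 4*i .. 4*i+3, signed
   xb:  16 bytes standing in for VSR[XB]; row j = bytes 4*j .. 4*j+3, unsigned
   acc: 4x4 block of 32-bit signed words standing in for ACC[AT] */
static void xvi8ger4pp_model(int32_t acc[4][4],
                             const int8_t xa[16],
                             const uint8_t xb[16])
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            int64_t sum = acc[i][j];          /* current accumulator value */
            for (int k = 0; k < 4; k++)       /* rank 4: four products */
                sum += (int64_t)xa[4 * i + k] * (int64_t)xb[4 * j + k];
            acc[i][j] = chop32(sum);          /* wraps, does not saturate */
        }
    }
}
```

Calling the model repeatedly on successive 16-byte slices of two operand matrices accumulates a 4x4 block of a larger integer matrix product, which is the pattern the instruction is meant to accelerate.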
Programming Note
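This instruction is typically used as the inner step of 8-bit integer matrix multiplication: each execution accumulates four more products into every element of a 4×4 block of 32-bit results. VSR[XA] and VSR[XB] are registers, so alignment applies only to the memory from which they are loaded; keep that source data properly aligned to avoid performance penalties. The result is chopped to a 32-bit signed integer rather than saturated, so a sum outside that range wraps around silently. Bound the number of accumulations or the magnitude of the inputs accordingly.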
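The overflow caution can be made concrete with a little arithmetic. The sketch below assumes the signed-by-unsigned reading of the operands, under which one update changes an element by at most 4 × 128 × 255 in magnitude. The helper names `guaranteed_safe_updates` and `would_wrap` are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Largest magnitude one update can add to an element, assuming signed
   XA bytes and unsigned XB bytes: four products of (-128) * 255. */
enum { MAX_STEP_MAGNITUDE = 4 * 128 * 255 };

/* Number of consecutive updates, starting from a zero accumulator, that
   cannot leave the 32-bit signed range whatever the inputs are. */
static int32_t guaranteed_safe_updates(void)
{
    return INT32_MAX / MAX_STEP_MAGNITUDE;
}

/* Returns 1 if adding delta to acc would leave the 32-bit signed range,
   which is exactly when the chopped result differs from the exact sum. */
static int would_wrap(int32_t acc, int32_t delta)
{
    int64_t exact = (int64_t)acc + (int64_t)delta;
    return exact > INT32_MAX || exact < INT32_MIN;
}
```

Under these assumptions the worst-case bound is conservative: real data rarely sits at the extremes, so software that knows its input range can often accumulate for longer before it needs to spill or widen the partial sums.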
Example
Encoding
Operands
AT - Accumulator
XA - Src A
XB - Src B