xvi4ger8

VSX Vector Integer 4-bit GER (Rank-8 Update)

xvi4ger8 AT, XA, XB

Computes the sum of eight outer products (a rank-8 update) of signed 4-bit integers held in two vector-scalar registers and writes the 4×4 result to an accumulator.

Details

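For each pair (i, j) with i and j in 0 to 3, the instruction multiplies the eight signed 4-bit elements in word i of VSR[XA] by the corresponding elements in word j of VSR[XB], sums the eight products, and places the sum in word j of row i of ACC[AT]. Each sum is chopped to a 32-bit signed integer. The previous contents of ACC[AT] are not used.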
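The two scalar conversions used here, sign extension of a 4-bit element (EXTS) and truncation to 32 bits (CHOP32), can be sketched as below. This is a minimal illustration, not taken from the specification: the names `exts4` and `chop32` are hypothetical helpers. Since each product lies between -56 and 64, a sum of eight products lies between -448 and 512, so the chop should never discard significant bits for this instruction.

```python
def exts4(nibble: int) -> int:
    """Sign-extend a 4-bit value (0..15) to an integer in -8..7."""
    nibble &= 0xF
    return nibble - 16 if nibble & 0x8 else nibble


def chop32(value: int) -> int:
    """Keep the low-order 32 bits and reinterpret them as a signed integer."""
    value &= 0xFFFF_FFFF
    return value - (1 << 32) if value & 0x8000_0000 else value


# Extreme 4-bit operands and the resulting bounds on one accumulator word.
largest = 8 * (exts4(0x8) * exts4(0x8))    # eight products of (-8) * (-8)
smallest = 8 * (exts4(0x8) * exts4(0x7))   # eight products of (-8) * 7
print(largest, smallest)
print(chop32(largest), chop32(smallest))
```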

Pseudocode Operation

if MSR.VSX=0 then VSX_Unavailable()

PMSK ← 0b11111111
XMSK ← 0b1111
YMSK ← 0b1111

do i = 0 to 3
   do j = 0 to 3
      if XMSK.bit[i] & YMSK.bit[j] then do
         prod0 ← (PMSK.bit[0]=0) ? 0 : EXTS(VSR[32×AX+A].word[i].nibble[0]) * EXTS(VSR[32×BX+B].word[j].nibble[0])
         prod1 ← (PMSK.bit[1]=0) ? 0 : EXTS(VSR[32×AX+A].word[i].nibble[1]) * EXTS(VSR[32×BX+B].word[j].nibble[1])
         prod2 ← (PMSK.bit[2]=0) ? 0 : EXTS(VSR[32×AX+A].word[i].nibble[2]) * EXTS(VSR[32×BX+B].word[j].nibble[2])
         prod3 ← (PMSK.bit[3]=0) ? 0 : EXTS(VSR[32×AX+A].word[i].nibble[3]) * EXTS(VSR[32×BX+B].word[j].nibble[3])
         prod4 ← (PMSK.bit[4]=0) ? 0 : EXTS(VSR[32×AX+A].word[i].nibble[4]) * EXTS(VSR[32×BX+B].word[j].nibble[4])
         prod5 ← (PMSK.bit[5]=0) ? 0 : EXTS(VSR[32×AX+A].word[i].nibble[5]) * EXTS(VSR[32×BX+B].word[j].nibble[5])
         prod6 ← (PMSK.bit[6]=0) ? 0 : EXTS(VSR[32×AX+A].word[i].nibble[6]) * EXTS(VSR[32×BX+B].word[j].nibble[6])
         prod7 ← (PMSK.bit[7]=0) ? 0 : EXTS(VSR[32×AX+A].word[i].nibble[7]) * EXTS(VSR[32×BX+B].word[j].nibble[7])

         psum ← prod0 + prod1 + prod2 + prod3 + prod4 + prod5 + prod6 + prod7

         ACC[AT][i].word[j] ← CHOP32( psum )
      end
      else
         ACC[AT][i].word[j] ← 0x0000_0000
   end
end
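The pseudocode above can be mirrored by a small software model. The following is a sketch under stated assumptions, not a definitive implementation: each source register is passed as a list of four 32-bit words (word[0] to word[3]), nibble[0] is taken to be the most significant nibble of a word, and mask bit 0 is taken to be the most significant mask bit. The function name `xvi4ger8_model` and the helper names are hypothetical. With the all-ones masks that this instruction uses, the mask bit order does not affect the result.

```python
def exts4(n):
    """Sign-extend a 4-bit value to an integer in -8..7."""
    n &= 0xF
    return n - 16 if n & 0x8 else n


def chop32(v):
    """Keep the low-order 32 bits, reinterpreted as a signed integer."""
    v &= 0xFFFF_FFFF
    return v - (1 << 32) if v & 0x8000_0000 else v


def nibble(word, k):
    """Nibble k of a 32-bit word (assumption: nibble 0 is most significant)."""
    return (word >> (28 - 4 * k)) & 0xF


def xvi4ger8_model(xa_words, xb_words, pmsk=0xFF, xmsk=0xF, ymsk=0xF):
    """Return acc[i][j], a 4x4 list of signed 32-bit results."""
    acc = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            # Assumption: mask bit 0 is the most significant bit of the mask.
            if (xmsk >> (3 - i)) & 1 and (ymsk >> (3 - j)) & 1:
                psum = 0
                for k in range(8):
                    if (pmsk >> (7 - k)) & 1:
                        psum += (exts4(nibble(xa_words[i], k))
                                 * exts4(nibble(xb_words[j], k)))
                acc[i][j] = chop32(psum)
            # Otherwise acc[i][j] stays 0, matching the else branch.
    return acc


# Example operands: rows of X are [1]*8, [2]*8, [-1]*8, [-8, 0, ..., 0];
# rows of Y are [1]*8, [7]*8, [-8]*8, [0, ..., 0, 1].
xa = [0x11111111, 0x22222222, 0xFFFFFFFF, 0x80000000]
xb = [0x11111111, 0x77777777, 0x88888888, 0x00000001]
acc = xvi4ger8_model(xa, xb)
for row in acc:
    print(row)
```

The mask parameters are kept only so the model follows the pseudocode line by line; for xvi4ger8 itself they are fixed to all ones.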

Programming Note

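Let X be the 4×8 matrix of 4-bit signed integer values contained in VSR[XA] in row-major format (row i is word i, holding eight elements). Let Y be the 4×8 matrix of 4-bit signed integer values contained in VSR[XB] in row-major format. Let ACC[AT] be the accumulator containing a 4×4 matrix of 32-bit signed integer values. The instruction then computes ACC[AT] = X × Yᵀ, that is, ACC[AT][i][j] is the sum over k = 0 to 7 of X[i][k] × Y[j][k].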
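Reading the pseudocode's indexing as X[i][k] = word[i].nibble[k] and Y[j][k] = word[j].nibble[k], the result can be viewed either as the product of X with the transpose of Y or as the sum of eight rank-1 outer products, one per element position k. The sketch below checks that the two views agree on already-unpacked values; the function names are hypothetical and the inputs are plain lists of signed integers rather than packed registers.

```python
def x_times_y_transpose(x, y):
    """x, y: 4 rows of 8 signed 4-bit values. Returns the 4x4 product x * y^T."""
    return [[sum(x[i][k] * y[j][k] for k in range(8)) for j in range(4)]
            for i in range(4)]


def sum_of_outer_products(x, y):
    """Same result, accumulated as eight outer products (one per position k)."""
    acc = [[0] * 4 for _ in range(4)]
    for k in range(8):
        for i in range(4):
            for j in range(4):
                # Outer product of column k of x with column k of y.
                acc[i][j] += x[i][k] * y[j][k]
    return acc


# Deterministic test values in the signed 4-bit range -8..7.
x = [[(i + k) % 16 - 8 for k in range(8)] for i in range(4)]
y = [[(3 * j + 5 * k) % 16 - 8 for k in range(8)] for j in range(4)]

assert x_times_y_transpose(x, y) == sum_of_outer_products(x, y)
print(x_times_y_transpose(x, y))
```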

Example

xvi4ger8 acc0, vs2, vs3

Encoding

Binary Layout

Format XX3-form

Opcode 0xEC000118

Extension MMA

Registers Altered ACC[AT]

Operands

AT Accumulator

XA Src A (4-bit)

XB Src B (4-bit)