PPT Slide
c7:        ldh  .D2   *B5++,B4
   ||      ldh  .D1   *A0++,A3
   || [B0] sub  .L2   B0,1,B0
   || [B0] B    .S2   c7
   ||      mpy  .M1x  B4,A3,A4
   ||      add  .L1   A4,A5,A5
Notes:
We can see that this is a completely optimized piece of code. The tools have eliminated all NOP instructions, and we have a single-cycle loop executing six instructions in parallel.
We also notice a .M1x. This shows that we can multiply data from the B register file and the A register file together. The x indicates a cross path. So I'm multiplying a B register by an A register and putting the result in an A register. In fact, we could have up to two cross paths: one from B to A and one from A to B.
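For reference, the loop on this slide appears to be a multiply-accumulate over two arrays of 16-bit values (two halfword loads, a multiply, an add into a running sum, and a counter decrement with a conditional branch). A minimal C sketch of that computation might look like the following. The function name dot_product and its parameter names are hypothetical, chosen only for illustration, and the slide does not show the C source the tools actually compiled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C equivalent of the loop on the slide (an assumption,
 * not the original source). Each iteration corresponds to:
 *   two ldh  -> load a[i] and b[i] (16-bit halfwords)
 *   mpy      -> 16x16 multiply
 *   add      -> accumulate into the running sum
 *   sub / B  -> decrement the counter and branch while it is nonzero
 * Assumes the running sum fits in 32 bits. */
int32_t dot_product(const int16_t *a, const int16_t *b, size_t count)
{
    int32_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

In the single-cycle loop on the slide, all six of these operations issue together, so the multiply and add in a given cycle work on values loaded in earlier cycles rather than on the values being loaded in that same cycle.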