This patch exploits the following Power9 instructions to improve some build_vector and extractelement patterns:
mtvsrws
lxvwsx
mtvsrdd
mfvsrld
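For readers unfamiliar with these instructions, the following is a simplified C++ model of what each one computes, based on the Power ISA 3.0 descriptions. It is a sketch of the semantics only, not code from the patch: the `VSR` struct and the function signatures are hypothetical, and register-number special cases are not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Simplified model of one 128-bit VSX register as two 64-bit doublewords,
// indexed in ISA element order (doubleword 0 is the most significant half).
struct VSR {
  std::array<uint64_t, 2> dword;

  // Word element i (0..3) in ISA element order.
  uint32_t word(unsigned i) const {
    assert(i < 4 && "a VSR holds four word elements");
    uint64_t d = dword[i / 2];
    return (i % 2 == 0) ? static_cast<uint32_t>(d >> 32)
                        : static_cast<uint32_t>(d);
  }
};

// mtvsrws: splat the low 32 bits of a GPR into all four word elements.
VSR mtvsrws(uint64_t ra) {
  uint64_t w = ra & 0xFFFFFFFFull;
  uint64_t d = (w << 32) | w;
  return VSR{{d, d}};
}

// lxvwsx: load one 32-bit word from memory and splat it into all four words.
VSR lxvwsx(const uint32_t *addr) {
  assert(addr != nullptr);
  return mtvsrws(*addr);
}

// mtvsrdd: build a register from two GPR values, one per doubleword.
VSR mtvsrdd(uint64_t ra, uint64_t rb) { return VSR{{ra, rb}}; }

// mfvsrld: move doubleword 1 (the lower doubleword) back to a GPR.
uint64_t mfvsrld(const VSR &xs) { return xs.dword[1]; }
```

In this model, a build_vector of four identical i32 values maps onto `mtvsrws` (or `lxvwsx` when the value comes from memory), a build_vector of two i64 values maps onto `mtvsrdd`, and extracting the second i64 element maps onto `mfvsrld`.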
Differential D21135
Power9 Instructions for build_vector improvements
Authored by nemanjai on Jun 8 2016, 7:04 AM.
Event Timeline
Added the missing check for only one use of the load when deciding whether to eliminate the splat when building a vector of i32s on Power9.
As we discussed, before you commit the change, please add -verify-machineinstrs to your regression tests. No need to upload the patch again. Thanks.
Some of the new instructions were being emitted for unintended code patterns (such as materializing a vector of zeros). The new sequences were inferior, so this update ensures that we emit the better code sequence. For example, due to the "AddedComplexity", the initial patch emitted a load-immediate followed by a direct move for materializing ones or zeros into a vector. A vector of zeros can be produced with a single XXLXOR. A vector of ones can be produced by a splat-immediate (especially now that we have a VSX version of it). This patch was functionally tested on the Power9 simulator.
This is perhaps minor, but we should rethink the change in PPCInstPrinter.cpp. If this change is needed, then we should change all the print routines in a similar manner.
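The materialization choice described in the update above (zeros via XXLXOR, small constants via a splat-immediate, everything else via load-immediate plus direct move) can be sketched as a small decision function. This is purely illustrative and is not the selection logic in the patch: the enum, the function name, and the use of the 5-bit signed range of `vspltisw` as the "splat-immediate" range are all assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical labels for the three code sequences discussed in the review.
enum class SplatSeq { XorZero, SplatImmediate, LoadImmThenDirectMove };

// Pick a sequence for splatting a 32-bit constant into all four words.
// Assumption: "splat-immediate" covers the 5-bit signed range of vspltisw.
SplatSeq chooseSplatSequence(int32_t value) {
  if (value == 0)
    return SplatSeq::XorZero;             // one xxlxor, no immediate needed
  if (value >= -16 && value <= 15)
    return SplatSeq::SplatImmediate;      // one splat-immediate
  return SplatSeq::LoadImmThenDirectMove; // load-immediate, then mtvsrws
}
```

Under these assumptions, both 1 and -1 take the single-instruction splat-immediate path, so the longer load-immediate plus direct-move sequence is reserved for constants that have no shorter encoding.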
Updated the truncation of the 32-bit unsigned value to 8 bits in PPCInstPrinter.cpp.
I'm not sure about this change.
Why are we printing as unsigned int, instead of unsigned char?
It seems like this method, and the method above (printU7ImmOperand) should be using (unsigned char) instead of (unsigned int). It looks like this was done with the printU10ImmOperand below (and probably others, but I didn't look exhaustively).
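One consideration that may bear on the question above: with standard C++ streams, streaming an `unsigned char` prints a character rather than a number, so a routine can truncate to 8 bits and still need to widen the value before printing. The sketch below shows this with `std::ostringstream` as a stand-in for the stream class used in the printer. Both function names are hypothetical, and whether this is the reason for the code under review is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for a printU8ImmOperand-style routine: truncate the
// immediate to 8 bits, then widen it again so the stream prints a number.
std::string printU8Imm(int64_t imm) {
  unsigned char value = static_cast<unsigned char>(imm); // keep low 8 bits
  std::ostringstream os;
  os << static_cast<unsigned int>(value); // prints digits, not a character
  return os.str();
}

// Same truncation, but streaming the unsigned char directly: std::ostream
// treats it as a character, so the value 65 is written as the letter A.
std::string printU8ImmAsChar(int64_t imm) {
  std::ostringstream os;
  os << static_cast<unsigned char>(imm);
  return os.str();
}
```

The point of the sketch is that the storage type controls the truncation while the type passed to the stream controls the formatting, so the two casts serve different purposes.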