Currently, a vector move of 0 or -1 uses different instructions depending on the size of the vector. Using a single instruction (the 128-bit one) for both sizes gives Machine CSE more opportunity to eliminate instructions.
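As a rough illustration of why one shared instruction helps, here is a toy model in Python. It is not LLVM code: the opcode names `MOV_IMM_64` and `MOV_IMM_128` are hypothetical placeholders, and the `cse` function is a simplified stand-in that only merges instructions with an identical opcode and immediate. The model also assumes that a 64-bit user can read its value from the 128-bit result.

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    opcode: str  # hypothetical opcode name, not a real LLVM opcode
    imm: int     # constant being materialized (0 or -1)


def select_move(width_bits: int, imm: int, unify: bool) -> Instr:
    """Pick the instruction for a constant vector move of the given width."""
    if unify or width_bits == 128:
        # With unify=True, every width uses the 128-bit form.
        return Instr("MOV_IMM_128", imm)
    return Instr("MOV_IMM_64", imm)


def cse(instrs: List[Instr]) -> List[Instr]:
    """Toy CSE: keep only the first occurrence of each identical instruction."""
    seen = set()
    kept = []
    for ins in instrs:
        if ins not in seen:
            seen.add(ins)
            kept.append(ins)
    return kept


def count_after_cse(unify: bool) -> int:
    # One 64-bit and one 128-bit zero vector materialized in the same block.
    moves = [select_move(64, 0, unify), select_move(128, 0, unify)]
    return len(cse(moves))
```

In this model, `count_after_cse(False)` leaves two instructions because the opcodes differ, while `count_after_cse(True)` leaves one because both moves become identical and the second is eliminated.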
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
LGTM.
I was thinking about how this might affect other little cores like the A53/A55, especially around the dual issue on q registers. I don't think it will make much difference though, and the CSE benefits look like a bigger win.