This isn't ready to commit yet:
- there are a few code quality regressions in the test cases
- it provokes a crash on test/CodeGen/AMDGPU/global-extload-i16.ll which I haven't been able to fix yet
- See inline comments on the regressions.
- The crash is fixed. See D87757.
There are some regressions in this file but also some improvements. I haven't worked out what's going on yet.
Regression here and in other cases, which now use muls instead of umull/umlal.
Regression. Quite a few tests are now using pxor+punpckhdq instead of pshufd. I wonder if some kind of combine could spot this case and turn it back into pshufd.
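To make the pshufd idea concrete, here is a small C++ lane model. It is not LLVM code, and the helper names are hypothetical. It assumes, without my having checked this against the affected tests, that the user of the shuffle only reads the even dwords (as pmuludq does). Under that assumption the zeros inserted by pxor+punpckhdq are never observed, so a single pshufd yields the same demanded lanes.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V4 = std::array<uint32_t, 4>; // four 32-bit lanes of an XMM register

// PUNPCKHDQ: interleave the high two dwords of a and b.
V4 punpckhdq(const V4 &a, const V4 &b) { return {a[2], b[2], a[3], b[3]}; }

// PSHUFD: each 2-bit field of imm selects one source dword.
V4 pshufd(const V4 &a, uint8_t imm) {
  return {a[imm & 3], a[(imm >> 2) & 3], a[(imm >> 4) & 3], a[(imm >> 6) & 3]};
}

// Hypothetical helper: do x and y agree on every lane set in demandedMask?
bool equalOnDemanded(const V4 &x, const V4 &y, unsigned demandedMask) {
  for (unsigned i = 0; i < 4; ++i)
    if (((demandedMask >> i) & 1) && x[i] != y[i])
      return false;
  return true;
}

// Worked example: compare the two sequences on one input vector.
void checkShuffleExample() {
  const V4 a = {10, 11, 12, 13};
  const V4 zero = {0, 0, 0, 0};          // result of pxor x, x
  const V4 viaUnpack = punpckhdq(a, zero); // lanes {a2, 0, a3, 0}
  const V4 viaShuffle = pshufd(a, 0xFA);   // lanes {a2, a2, a3, a3}
  assert(equalOnDemanded(viaUnpack, viaShuffle, 0x5)); // even lanes only
  assert(!equalOnDemanded(viaUnpack, viaShuffle, 0xF)); // all lanes differ
}
```

If the odd lanes are demanded (a real zero extension), the two sequences are not interchangeable, so any such combine would have to be driven by demanded-elements information.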
Regression. Perhaps we need better known bits analysis.
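For reference, the kind of reasoning meant by known bits can be sketched with a toy 32-bit model. This is an illustration only: the struct and function names are hypothetical and this is not the llvm::KnownBits API, nor a diagnosis of this particular regression. Each value carries a mask of bits known to be zero and a mask of bits known to be one, and each operation propagates those masks.

```cpp
#include <cassert>
#include <cstdint>

// Toy known-bits lattice for 32-bit values (hypothetical, not LLVM's type).
struct KnownBits32 {
  uint32_t Zero = 0; // bits known to be 0
  uint32_t One = 0;  // bits known to be 1
};

// Every bit of a constant is known.
KnownBits32 knownConstant(uint32_t c) { return {~c, c}; }

// AND: a result bit is 0 if either input bit is 0, and 1 only if both are 1.
KnownBits32 knownAnd(KnownBits32 a, KnownBits32 b) {
  return {a.Zero | b.Zero, a.One & b.One};
}

// OR: a result bit is 1 if either input bit is 1, and 0 only if both are 0.
KnownBits32 knownOr(KnownBits32 a, KnownBits32 b) {
  return {a.Zero & b.Zero, a.One | b.One};
}

// Left shift by a constant amount (amt must be < 32): low bits become 0.
KnownBits32 knownShl(KnownBits32 a, unsigned amt) {
  assert(amt < 32);
  return {(a.Zero << amt) | ((1u << amt) - 1), a.One << amt};
}

// True if every bit in mask is known to be zero.
bool maskedValueIsZero(KnownBits32 k, uint32_t mask) {
  return (k.Zero & mask) == mask;
}
```

With this model, `(x & 0xFF) << 4` for a fully unknown `x` has bits 12 and above known zero, which is the sort of fact a combine needs in order to drop a later mask or extension.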