The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
934–935 ↗ | (On Diff #34170) | Use 'auto *' for pointers? Same comment for the other autos below. |
972–986 ↗ | (On Diff #34170) | Isn't this identical to the extrqi case? If yes, combine cases and eliminate the duplicated code. Could also just add a 3-4 line helper function that can be used for all of the cases? |
I'll get this updated - thanks for looking at this, I know SSE4A isn't the most exciting of instructions to deal with!!
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
934–935 ↗ | (On Diff #34170) | No problem. |
972–986 ↗ | (On Diff #34170) | I'll look into creating a helper (this is one of the most common patterns in this entire file....). I'm intending to add support for decoding EXTRQI to shuffles soon so wish to keep the actual cases separate. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
936–937 ↗ | (On Diff #34170) | I think it'd be more concise to use Op->getType()->getVectorNumElements() |
Updated - thanks for the comments. If you chaps are happy with the SimplifyDemandedVectorEltsLow helper I can convert other cases to use it in future commits.
Can we hoist the getVectorNumElements() and associated assert into the helper as well? If yes, please do. And certainly now that we have a helper, let's use it to reduce the code bloat wherever possible.
LGTM.
Thanks Sanjay, I'll look at hoisting the getVectorNumElements() when I update the other functions to use the helper - probably over the weekend.