Currently this is mainly to prevent scalarization of integer division by constants.
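
For context on why MULHU matters for division by constants, here is a minimal scalar sketch of the usual multiply-high trick, modeled on a single i8 lane. The helper names (`mulhu8`, `udiv3`, `checkUdiv3`) and the choice of divisor 3 with multiplier 171 are illustrative assumptions, not code from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of MULHU on one i8 lane: the high 8 bits of the 16-bit
// unsigned product.
constexpr uint8_t mulhu8(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>((static_cast<uint16_t>(a) * b) >> 8);
}

// x / 3 without a divide. 171 == ceil(2^9 / 3), so (x * 171) >> 9 equals
// x / 3 for every 8-bit x. The >> 9 is split into MULHU (>> 8) and one
// extra logical shift.
constexpr uint8_t udiv3(uint8_t x) {
  return static_cast<uint8_t>(mulhu8(x, 171) >> 1);
}

// Exhaustively compare against a real divide over all 256 inputs.
inline void checkUdiv3() {
  for (int x = 0; x < 256; ++x)
    assert(udiv3(static_cast<uint8_t>(x)) == x / 3);
}
```

If a vector MULHU is not available for the element type, each lane of such a division has to be handled separately, which is the scalarization this change is meant to avoid.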
Details
- Reviewers
spatel qcolombet andreadb congh
- Commits
rGe4dbeb40c6a0: [X86][AVX] Enabled MULHS/MULHU v16i16 vectors on AVX1 targets
rG3eef33a80656: [X86][SSE] Add MULHS/MULHU custom lowering for i8 vectors
rL264512: [X86][AVX] Enabled MULHS/MULHU v16i16 vectors on AVX1 targets
rL264511: [X86][SSE] Add MULHS/MULHU custom lowering for i8 vectors
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
19053 ↗ | (On Diff #51137) | Right now, Subtarget.hasInt256() should always be true for 256-bit vectors, since we add the custom lowering for those only when this predicate is true. |
19078 ↗ | (On Diff #51137) | This returns the low value, right? |
19094 ↗ | (On Diff #51137) | A few comments on how we achieve the extraction wouldn’t hurt :). E.g., Fill the 8 high bits of a v8i16 vector with the low input vector. |
Revised based on Quentin's feedback.
Enabled v32i8 and v16i16 MULHS/MULHU custom lowering on AVX1 targets to prevent scalarization - I can commit the v16i16 support separately if necessary.
Improved the AVX v32i8 lowerings to make use of ymm PACKUS (saves 2cy on my Carrizo tests) and added a missing shift to the v16i8 version (I had stupidly only locally tested v32i8 on AVX2 hardware).
Tried to improve comments on what is going on.
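
As a rough illustration of the extraction discussed in the inline comments, the following scalar sketch models one half of an i8 MULHS/MULHU lowered through 16-bit lanes. The function name `mulh8ViaI16` and the lane-by-lane structure are assumptions for illustration; the actual lowering builds the equivalent vector nodes. It also assumes two's complement conversion and arithmetic right shift of negative values, which C++20 guarantees and mainstream compilers provide earlier.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V8i8 = std::array<uint8_t, 8>;

// One 8-element half of an i8 MULHS/MULHU, computed via 16-bit lanes.
inline V8i8 mulh8ViaI16(const V8i8 &a, const V8i8 &b, bool isSigned) {
  V8i8 result{};
  for (std::size_t i = 0; i < 8; ++i) {
    // Unpack step: place each input byte in the high 8 bits of a 16-bit lane.
    uint16_t aHi = static_cast<uint16_t>(a[i] << 8);
    uint16_t bHi = static_cast<uint16_t>(b[i] << 8);

    // Shift right by 8 to extend: arithmetic for MULHS, logical for MULHU.
    int aExt = isSigned ? (static_cast<int16_t>(aHi) >> 8) : (aHi >> 8);
    int bExt = isSigned ? (static_cast<int16_t>(bHi) >> 8) : (bHi >> 8);

    // 16-bit multiply that keeps only the low 16 bits of the product.
    uint16_t prod = static_cast<uint16_t>(aExt * bExt);

    // Logical shift right by 8 moves the high byte of the product down.
    uint16_t hi = static_cast<uint16_t>(prod >> 8);

    // Pack step: hi is at most 255, so unsigned saturation never triggers.
    result[i] = static_cast<uint8_t>(hi);
  }
  return result;
}
```

The final logical shift is what makes an unsigned-saturating pack safe here: without it, the 16-bit products would be clamped to 255 instead of yielding their high bytes.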
Hi Simon,
LGTM. Couple of nits inlined.
Cheers,
-Quentin
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
19094 ↗ | (On Diff #51350) | Add that unlike the smaller PACKUS, the ymm variant interleaves the 128-bit lanes of both sources. |
19117 ↗ | (On Diff #51350) | Period. |
19133 ↗ | (On Diff #51350) | Period. |
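
To make the ymm PACKUS remark concrete, here is a small scalar model of the 256-bit PACKUSWB behaviour, where each 128-bit lane is packed independently. The names `satU8` and `packusYmm` are hypothetical helpers for this sketch, not identifiers from the patch.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

using V16i16 = std::array<int16_t, 16>;
using V32i8 = std::array<uint8_t, 32>;

// Clamp a signed 16-bit value to [0, 255], as PACKUSWB does per element.
inline uint8_t satU8(int16_t v) {
  return static_cast<uint8_t>(std::min<int>(255, std::max<int>(0, v)));
}

// Model of the 256-bit PACKUSWB: within each 128-bit lane, the 8 words of
// `a` come first and the 8 words of `b` second, so the result reads
// a.lo, b.lo, a.hi, b.hi rather than a.lo, a.hi, b.lo, b.hi.
inline V32i8 packusYmm(const V16i16 &a, const V16i16 &b) {
  V32i8 r{};
  for (std::size_t lane = 0; lane < 2; ++lane) {
    for (std::size_t i = 0; i < 8; ++i) {
      r[lane * 16 + i] = satU8(a[lane * 8 + i]);
      r[lane * 16 + 8 + i] = satU8(b[lane * 8 + i]);
    }
  }
  return r;
}
```

With this layout, the two sources passed to the pack must be arranged so that the per-lane interleaving produces the bytes in their final order, otherwise a further shuffle is needed afterwards.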