These intrinsics have been unused by clang for a while. This patch removes them. We auto-upgrade them to extractelements, a scalar operation, and then an insertelement. This matches the sequence used by clang's intrinsic header.
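As a rough illustration of the extract / scalar op / insert sequence, here is a minimal C sketch of a scalar add on the low lane. It assumes a compiler with the GCC/Clang `vector_size` extension; the names `v4sf` and `add_ss_sketch` are hypothetical and not taken from the patch or from clang's header.

```c
/* Hypothetical 4 x float vector type, using the GCC/Clang vector extension. */
typedef float v4sf __attribute__((vector_size(16)));

/* Sketch of the upgraded form of a scalar-add intrinsic:
 * only lane 0 is computed, lanes 1..3 of `a` pass through unchanged. */
static v4sf add_ss_sketch(v4sf a, v4sf b) {
    float a0 = a[0];      /* corresponds to an extractelement from a, index 0 */
    float b0 = b[0];      /* corresponds to an extractelement from b, index 0 */
    float sum = a0 + b0;  /* the scalar operation (an fadd here) */
    a[0] = sum;           /* corresponds to an insertelement into a, index 0 */
    return a;
}
```

Calling `add_ss_sketch` on `{1, 2, 3, 4}` and `{10, 20, 30, 40}` changes only the low lane; the upper three lanes of the first operand are returned as they were.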
Repository: rL LLVM
Event Timeline
lib/Target/X86/X86InstrSSE.td | ||
---|---|---|
3088 ↗ | (On Diff #77948) | Why do you need to call basic_sse12_fp_binop_s_int for ADD if you don't have any intrinsic? |
Cause I'm dumb. It's way simpler to just remove it. Guess I was thinking there was more inheritance there than there is.
Wait, never mind. We need the intrinsic instructions to still exist because they are referenced by scalar_math_f32_patterns and scalar_math_f64_patterns. The _Int instructions are still used for the actual lowering of the patterns used by the upgrade and by the clang intrinsics.
test/CodeGen/X86/vec_ss_load_fold.ll | ||
---|---|---|
41 ↗ | (On Diff #77948) | This redundant blend should be documented in Bugzilla. It would be best to fix this before committing this patch. |
41 ↗ | (On Diff #77948) | That blend exists because there is a vzmovl created from the inserts of 0s that was pushed up to here and was then blocked by the min/max nodes. I can't pattern match it out. We need some sort of demanded-elements filtering that figures out that vcvttss2si doesn't want the upper bits and that the min/max nodes pass the bits straight through and thus don't want the bits either, and then pushes that all the way back to remove the original insert elements. Or something like that. I'll file a bug, but I don't think it should block a patch that was just trying to remove an intrinsic that clang doesn't use. I could write this same test case in clang without this intrinsic and see the same extra blend. |
41 ↗ | (On Diff #77948) | Thanks |
combineTargetShuffle has some old canonicalization code for v2f64 BLENDs, supposedly for scalar folding. This could be a good place to start.