This is an archive of the discontinued LLVM Phabricator instance.

[X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul
ClosedPublic

Authored by craig.topper on Nov 14 2016, 11:03 PM.

Details

Summary

These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper retitled this revision from to [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul.
craig.topper updated this object.
craig.topper added reviewers: zvi, RKSimon, delena.
craig.topper added a subscriber: llvm-commits.
delena added inline comments.Nov 14 2016, 11:09 PM
lib/Target/X86/X86InstrSSE.td
3088 ↗(On Diff #77948)

why do you need to call basic_sse12_fp_binop_s_int for ADD if you don't have any intrinsic?

Cause I'm dumb. That's way simpler to just remove it. Guess I was thinking there was more inheritance there than there is.

Wait. Nevermind. We need the instrinsic instructions to still exist because they are referenced by scalar_math_f32_patterns and scalar_math_f64_patterns. The _Int instrucctions are still used for the actual lowering of the patterns used by the upgrade and by the clang intrinsics

zvi added inline comments.Nov 14 2016, 11:44 PM
test/CodeGen/X86/vec_ss_load_fold.ll
41 ↗(On Diff #77948)

This redundant blend should be documented in Bugzilla. It would be best to fix this before committing this patch.

craig.topper added inline comments.Nov 15 2016, 12:03 AM
test/CodeGen/X86/vec_ss_load_fold.ll
41 ↗(On Diff #77948)

That blend exists because there is a vzmovl created from the inserts of 0s that pushed up to here and was then blocked by the min/max nodes. I can't pattern match it out.

We need some sort of demanded elements filtering that figures out vcvttss2si doesn't want the upper bits and that the min/max pass the bits straight through and thus don't want the bits either. And push that all the way back to remove the original insert elements. Or something like that.

I'll file a bug, but I don' think it should block a patch that was just trying to remove an intrinsic that clang doesn't use. I could write this same test case in clang without this instrinsic and see the same extra blend.

zvi added inline comments.Nov 15 2016, 12:13 AM
test/CodeGen/X86/vec_ss_load_fold.ll
41 ↗(On Diff #77948)

Thanks

delena accepted this revision.Nov 15 2016, 1:14 AM
delena edited edge metadata.
This revision is now accepted and ready to land.Nov 15 2016, 1:14 AM
RKSimon edited edge metadata.Nov 15 2016, 4:28 AM

combineTargetShuffle has some old canonicalization code for v2f64 BLENDs, supposedly for scalar folding. This could be a good place to start.

This revision was automatically updated to reflect the committed changes.