This is an archive of the discontinued LLVM Phabricator instance.

[X86] Disable f32->f64 extload when sse2 is enabled
ClosedPublic

Authored by craig.topper on May 30 2019, 5:28 PM.

Details

Summary

We can only use the memory form of cvtss2sd under optsize because of its partial register update. So previously we were emitting two instructions for extload when optimizing for speed. Also, due to a late optimization in PreprocessISelDAG, we had to handle (fpextend (loadf32)) under optsize.

This patch forces extload to expand so that it will always be in the (fpextend (loadf32)) form during isel. And when optimizing for speed we can just let each of those pieces select an instruction independently.
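
The effect described above can be sketched with a small standalone toy model. This is not LLVM code: the `Node` type and the helpers `makeExtLoad`, `expandExtLoad`, `print`, and `selectInstrs` are hypothetical names invented for illustration, and the instruction labels are only descriptive strings. The sketch assumes the expansion simply splits the extending load into a plain f32 load feeding an fpextend, which is the form the summary says isel will always see.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy node kinds, named after the DAG forms quoted in the summary.
enum class Op { LoadF32, ExtLoadF32, FPExtend };

struct Node {
    Op op;
    std::string addr;              // used by LoadF32 and ExtLoadF32
    std::shared_ptr<Node> operand; // used by FPExtend
};
using NodePtr = std::shared_ptr<Node>;

// Build the combined f32->f64 extending load (hypothetical helper).
NodePtr makeExtLoad(const std::string &addr) {
    return std::make_shared<Node>(Node{Op::ExtLoadF32, addr, nullptr});
}

// Model of the "expand" step: split the extending load into a plain
// f32 load wrapped in an fpextend. Other nodes are returned unchanged.
NodePtr expandExtLoad(const NodePtr &n) {
    if (n->op != Op::ExtLoadF32)
        return n;
    auto load = std::make_shared<Node>(Node{Op::LoadF32, n->addr, nullptr});
    return std::make_shared<Node>(Node{Op::FPExtend, "", load});
}

// Render a node in the S-expression style used in the review.
std::string print(const NodePtr &n) {
    switch (n->op) {
    case Op::LoadF32:    return "(loadf32 " + n->addr + ")";
    case Op::ExtLoadF32: return "(extloadf32 " + n->addr + ")";
    case Op::FPExtend:   return "(fpextend " + print(n->operand) + ")";
    }
    return "";
}

// Toy selection of (fpextend (loadf32 addr)). Under optsize the load is
// folded into one memory-form convert; for speed, each piece selects its
// own instruction. The strings are labels, not real assembler output.
std::vector<std::string> selectInstrs(const NodePtr &n, bool optSize) {
    // The extload must already have been expanded.
    assert(n->op == Op::FPExtend && n->operand &&
           n->operand->op == Op::LoadF32);
    if (optSize)
        return {"cvtss2sd (memory form)"};
    return {"movss (load)", "cvtss2sd (register form)"};
}
```

In this model, expanding `makeExtLoad("addr:$src")` produces a single canonical shape, so the size and speed cases only differ in how that one shape is selected.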

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. May 30 2019, 5:28 PM
Herald added a project: Restricted Project. May 30 2019, 5:28 PM
Herald added a subscriber: hiraditya.

Add a comment above the setLoadExtAction call

Do we have all the necessary test coverage for these cases?

Removing the "(fpextend (loadf32 addr:$src))" pattern without this patch doesn't fail any tests. I'm not sure whether the peephole pass kicks in to hide it.
Removing the (extloadf32 addr:$src) pattern without this patch fails isel, since the extload won't match.

With the change to X86ISelLowering.cpp from this patch applied and the (fpextend (loadf32 addr:$src)) pattern removed, we fail to fold some loads in sse2-intrinsics-upgrade-x86.ll because it explicitly disables the peephole pass. brcond.ll is affected as well. If I also remove the load folding table entries, sse_partial_update.ll fails too. So do the stack folding and fast isel tests, but those are to be expected since they depend exclusively on the table.

RKSimon accepted this revision. Jun 9 2019, 2:40 PM

OK, let's go for it - cheers, LGTM

This revision is now accepted and ready to land. Jun 9 2019, 2:40 PM
This revision was automatically updated to reflect the committed changes.