This is an archive of the discontinued LLVM Phabricator instance.

[X86] Remove X86ISD::FILD_FLAG and stop gluing nodes together.
ClosedPublic

Authored by craig.topper on Jan 15 2020, 1:17 PM.

Details

Summary

I think whatever problem the gluing was fixing has long since been fixed. We don't have any of the restrictions on FP stack stuff that existed back when this was first added.

I had to change which type we use for FILD in BuildFILD when X86 was enabled because most of the isel patterns block f32/f64 instructions when SSE1/SSE2 are enabled. So I needed to use the f80 pattern, but this shouldn't have an effect the generated code since there is only one FILD instruction anyway. We already use f80 explicitly in other other places.

Diff Detail

Event Timeline

craig.topper created this revision.Jan 15 2020, 1:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 15 2020, 1:17 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
RKSimon added inline comments.Jan 17 2020, 5:41 AM
llvm/test/CodeGen/X86/vec-strict-inttofp-256.ll
659

Its annoying that the update scripts hides the pointer math for stack accesses - it'd be good to check what aliasing is occurring. But it should be OK.

Rebase. Regenerate with a hacked script to show the real stack offsets for reviewing. I'll regenerate before commiting so the next person who runs the script won't get a surprise.

craig.topper marked an inline comment as done.Jan 18 2020, 1:01 AM
craig.topper added inline comments.
llvm/test/CodeGen/X86/vec-strict-inttofp-256.ll
636

I wonder if it would make sense to parallelize this. I think we can shift the v4i64 right by 32, trunc to v4i32 use sitofp to convert that part to double. Multiply that by 2^32 in double. That should all be lossless.

Then for the bottom 32 bits we can mask with 0xffffffff. OR with the double representation for 2^52. Then subtract 2^52 from it. This should also be lossless.

Then we just add the two double vectors together which should be the only part that does any rounding.

RKSimon accepted this revision.Jan 18 2020, 6:56 AM
RKSimon added subscribers: scanon, andrew.w.kaylor.

Cheers Craig, TBH I'm left wondering whether we should tweak the update script to always keep stack arithmetic for x86 - do you think they'd be too much churn?

Anyway, LGTM as a first step, but it highlights a number of topics for further possible work (nothing new there......).

llvm/test/CodeGen/X86/vec-strict-inttofp-256.ll
636

@scanon or @andrew.w.kaylor should be able to confirm but that sounds alright to me

llvm/test/CodeGen/X86/vec-strict-inttofp-512.ll
425–426

Its a pity this doesn't spill to stack in order, which would've allowed us to load ymm0 with a single vmovups - lack of scalar/vector stack aliasing awareness has come up in some other bugs IIRC

This revision is now accepted and ready to land.Jan 18 2020, 6:56 AM

Cheers Craig, TBH I'm left wondering whether we should tweak the update script to always keep stack arithmetic for x86 - do you think they'd be too much churn?

Would probably cause a lot of churn in the near term, but we regularly regenerate tests before patches so maybe not a big deal? Another option might be to just add an option to disable like the x86_scrub_rip/no_x86_scrub_rip option?

Anyway, LGTM as a first step, but it highlights a number of topics for further possible work (nothing new there......).

This revision was automatically updated to reflect the committed changes.