This is an archive of the discontinued LLVM Phabricator instance.

[X86] Improve v2i64->v2f32 and v4i64->v4f32 uint_to_fp on avx and avx2 targets.
ClosedPublic

Authored by craig.topper on Dec 28 2019, 12:25 AM.

Details

Summary

Based on Simon's D52965, but extended to handle strict FP and to improve some of the shuffling.

Rather than use v2i1/v4i1 and let type legalization continue, just generate all the code with legal types and use an explicit shuffle.

I also added an explicit setcc to the v4i64 code to match the semantics of vselect, which doesn't just use the sign bit. I'm also using a v4i64->v4i32 truncate instead of the shuffle in Simon's original code. With the setcc in place, this truncate will become a pack.

Future work can look into using X86ISD::BLENDV and a different shuffle that only moves the sign bit.
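
For readers unfamiliar with the underlying trick, here is a scalar sketch of one lane of an unsigned i64 -> f32 conversion built only from a signed conversion. It is not the code from this patch. It assumes the usual halve-and-double approach (halve the input with a sticky low bit, convert as signed, then add the result to itself), which the summary does not spell out, and the function name is hypothetical. The explicit `Src < 0` compare stands in for the setcc that feeds the select.

```cpp
#include <cstdint>

// Hypothetical scalar model of one vector lane; the real lowering builds
// the same steps as SelectionDAG nodes on legal vector types.
float uint64_to_float_via_signed(uint64_t src) {
  // Explicit signed compare (Src < 0): true when the top bit is set, i.e.
  // the value does not fit in a signed 64-bit integer.
  bool is_neg = static_cast<int64_t>(src) < 0;

  // Halve the value, OR-ing the shifted-out bit back in as a sticky bit so
  // the later rounding to float still sees that the value was inexact.
  uint64_t halved = (src >> 1) | (src & 1);

  // Select the input for the signed conversion.
  uint64_t input = is_neg ? halved : src;
  float converted = static_cast<float>(static_cast<int64_t>(input));

  // Undo the halving for the large-value case.
  float doubled = converted + converted;

  // Select the final result with the same condition.
  return is_neg ? doubled : converted;
}
```

Both selects key off the same condition, so getting its polarity wrong (testing 0 < Src instead of Src < 0) would halve and double the small values and pass the large ones straight to the signed conversion.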

Event Timeline

craig.topper created this revision. Dec 28 2019, 12:25 AM
Herald added a project: Restricted Project. Dec 28 2019, 12:25 AM
Herald added a subscriber: hiraditya.

Fix comparison polarity. We were checking 0 < Src instead of Src < 0.

Add v4i64->v4f32 as well

craig.topper retitled this revision from "[X86] Improve v2i64->v2f32 uint_to_fp on avx and avx2 targets." to "[X86] Improve v2i64->v2f32 and v4i64->v4f32 uint_to_fp on avx and avx2 targets." Dec 28 2019, 11:56 AM
craig.topper edited the summary of this revision.
RKSimon added inline comments. Dec 29 2019, 2:46 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
29105

Use DAG.getBitcast here.

llvm/test/CodeGen/X86/vec-strict-inttofp-256.ll
1062

AVX1 is using vpcmpgtq for ymm?

craig.topper marked an inline comment as done. Dec 29 2019, 2:54 AM
craig.topper added inline comments.
llvm/test/CodeGen/X86/vec-strict-inttofp-256.ll
1062

Looks like the llc part of the command line is really avx2.

Address review comment and rebase after fixing test run lines.

RKSimon accepted this revision. Jan 5 2020, 12:34 AM

LGTM - are the TTI costs still accurate for the new codegen?

llvm/lib/Target/X86/X86ISelLowering.cpp
29075

Can you raise a bug+testcase for this please?

This revision is now accepted and ready to land. Jan 5 2020, 12:34 AM
This revision was automatically updated to reflect the committed changes.