This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX] Improve vXi64 UITOFP vXf64/vXf32 support (P38226/PR38970)
Abandoned · Public

Authored by RKSimon on Oct 6 2018, 4:13 PM.

Details

Summary

An initial attempt to improve vXi64 UITOFP conversions:

vXi64-vXf64 - perform this as a true vectorization instead of (partially vectorized) scalar conversions, by adding vector support to ExpandLegalINT_TO_FP
vXi64-vXf32 - SSE-customized versions of the ExpandLegalINT_TO_FP code, avoiding a lot of branches that were often poorly predicted
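
For readers unfamiliar with the expansions being vectorized, here is a minimal scalar sketch of the two branch-free techniques, one lane at a time. It is not the code from this patch: the function names (bitsToDouble, u64ToF64, u64ToF32) are hypothetical, and the sketch assumes IEEE-754 binary64/binary32 arithmetic in the default round-to-nearest mode. The u64 to f64 path follows the __floatundidf-style hi/lo split with magic constants; the u64 to f32 path halves the input with a sticky low bit, converts as signed, doubles the result, and selects on the sign bit.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret 64 integer bits as a double (scalar analogue of a vector bitcast).
static double bitsToDouble(std::uint64_t bits) {
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// u64 -> f64 without branches or a native unsigned conversion.
// Assumption: IEEE-754 doubles, round-to-nearest, no fast-math reassociation.
double u64ToF64(std::uint64_t x) {
  const std::uint64_t twoP52Bits = 0x4330000000000000ULL;  // bit pattern of 2^52
  const std::uint64_t twoP84Bits = 0x4530000000000000ULL;  // bit pattern of 2^84
  const double twoP84PlusTwoP52 = bitsToDouble(0x4530000000100000ULL);

  // Placing the low 32 bits in the mantissa of 2^52 gives exactly 2^52 + lo.
  double lo = bitsToDouble((x & 0xFFFFFFFFULL) | twoP52Bits);
  // Placing the high 32 bits in the mantissa of 2^84 gives 2^84 + hi * 2^32.
  double hi = bitsToDouble((x >> 32) | twoP84Bits);

  // (hi - (2^84 + 2^52)) is exact; the final add rounds once.
  return (hi - twoP84PlusTwoP52) + lo;
}

// u64 -> f32 using only signed conversions plus a select on the sign bit.
// The casts to int64_t rely on two's-complement wraparound (guaranteed since
// C++20, implementation-defined but universal in practice before that).
float u64ToF32(std::uint64_t x) {
  // Correct whenever the top bit is clear.
  float fast = static_cast<float>(static_cast<std::int64_t>(x));

  // Halve the value, keeping the dropped bit as a sticky bit so rounding
  // is unaffected, convert as signed, then double (exact).
  std::uint64_t halved = (x >> 1) | (x & 1);
  float slow = static_cast<float>(static_cast<std::int64_t>(halved));
  slow = slow + slow;

  // In vector form this select is what a BLENDV keyed on the sign bit does.
  bool signBitSet = static_cast<std::int64_t>(x) < 0;
  return signBitSet ? slow : fast;
}
```

Both results are computed unconditionally and then selected, which is what makes the approach map onto vector lanes without the poorly predicted branches mentioned above.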

There's still room for improvement:

uitofp_4i64_to_4f64 - AVX1 codegen should be able to perform the vpsrlq xmm shifts as ymm (v8f32) shuffles
uitofp_Xi64_to_Xf32 - some of the BLENDV cases should be selected from the sign bit directly and not need a shift/comparison

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. Oct 6 2018, 4:13 PM
craig.topper added inline comments. Oct 6 2018, 4:53 PM
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2415

So this code was guaranteed unreachable due to the assert? Are we sure it's correct for signed? That assert has been there since 2006, and Owen Anderson added the algorithm for __floatundidf in 2010. So this code might actually be a different version of unsigned handling.

craig.topper added inline comments. Oct 6 2018, 5:34 PM
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2362

I think we really need to handle this in the expand code in LegalizeVectorOps. We hit the expand there first, which scalarizes it. Then DAG combine reassembles it in reduceBuildVecConvertToConvertBuildVec. Then we hit this code in LegalizeDAG. But I don't think we really want to rely on DAG combine like that.

craig.topper added inline comments. Oct 6 2018, 5:50 PM
lib/Target/X86/X86ISelLowering.cpp
17122

is64Bit should be here as well I think?

RKSimon abandoned this revision. Jan 6 2020, 2:47 AM

@craig.topper's recent patches cover this

Herald added a project: Restricted Project. Jan 6 2020, 2:47 AM