This is an archive of the discontinued LLVM Phabricator instance.

[LV] Account for the cost of predication of scalarized load/store
ClosedPublic

Authored by dmgreen on Mar 9 2021, 2:44 AM.

Details

Summary

This adds the cost of an i1 extract and a branch to the cost in getMemInstScalarizationCost when the instruction is predicated. These predicated loads/store would generate blocks of something like:

  %c1 = extractelement <4 x i1> %C, i32 1
  br i1 %c1, label %if, label %else
if:
  %sa = extractelement <4 x i32> %a, i32 1
  %sb = getelementptr inbounds float, float* %pg, i32 %sa
  %sv = extractelement <4 x float> %x, i32 1
  store float %sa, float* %sb, align 4
else:

So this increases the cost by the extract and branch. This is probably still too low in many cases due to the cost of all that branching, but there is already an existing hack increasing the cost using useEmulatedMaskMemRefHack. It will increase the cost if it is a load or there are more than one store. This patch improves the cost for when there is only a single store improving the attached test.

(The hack can hopefully be removed at some point. I think it would be OK to remove from the ARM testing I tried, but there are a number of X86 tests that suggest it's still needed, and the cost of an i1 extract + branch is probably too low to accurately represent the cost of trying to perform all these branches).

Diff Detail

Event Timeline

dmgreen created this revision.Mar 9 2021, 2:44 AM
dmgreen requested review of this revision.Mar 9 2021, 2:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2021, 2:44 AM
SjoerdMeijer accepted this revision.Mar 16 2021, 6:57 AM

This looks very reasonable to me.

PS Nothing wrong with a descriptive function name: useEmulatedMaskMemRefHack:-)

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6769

Nit: I see there's precedent of not doing it in this file, but

.., false /* Insert */, true /* Extract */);

would have helped readability for me.

This revision is now accepted and ready to land.Mar 16 2021, 6:57 AM
This revision was landed with ongoing or failed builds.Mar 17 2021, 3:58 AM
This revision was automatically updated to reflect the committed changes.