This is an archive of the discontinued LLVM Phabricator instance.

[LV] Add analysis remark for mixed precision conversions
ClosedPublic

Authored by jhuber6 on Jan 27 2021, 9:21 AM.

Details

Summary

Floating point conversions inside vectorized loops have performance
implications but are very subtle. The user could specify a floating
point constant, or call a function without realizing that it will
force a change in the vector width. An example of this behaviour is
seen in https://godbolt.org/z/M3nT6c . The vectorizer should indicate
when this happens becuase it is most likely unintended behaviour.

This patch adds a simple check for this behaviour by following floating
point stores in the original loop and checking if a floating point
conversion operation occurs.

Diff Detail

Event Timeline

jhuber6 created this revision.Jan 27 2021, 9:21 AM
jhuber6 requested review of this revision.Jan 27 2021, 9:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 27 2021, 9:21 AM

If anyone knows a better place to put this let me know. I couldn't find any existing infrastructure to run checks on the loop after its been vectorized.

This is already caught by -Wimplicit-float-conversion/-Wdouble-promotion though
https://godbolt.org/z/o5Pzh4

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9287

You should be checking the type of the value being stored, not the pointer's element type

fhahn requested changes to this revision.Jan 27 2021, 9:35 AM
fhahn added a subscriber: fhahn.

Please add test cases for the change.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9290

This potentially is a bit confusing, because a store has no uses (it's of type void). We are traversing the operands upwards.

9292

Can we collect all FP stores in a first pass, and then do the wordlist traversal for the all fp stores together afterwards? This way, we won't get duplicated remarks, if the same value is stored multiple times.

9636

Unrelated change?

9647

I think usually ORE->allowExtraAnalysis(LV_NAME); is used to check if extra analysis should be done for remarks.

This revision now requires changes to proceed.Jan 27 2021, 9:35 AM

This is already caught by -Wimplicit-float-conversion/-Wdouble-promotion though
https://godbolt.org/z/o5Pzh4

Yes, but ;)

If you don't vectorize it is half as bad, arguably. Signal to noise might make the FE warnings unusable, every conversion is flagged after all https://godbolt.org/z/6hEjfx

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9292

I was expecting to have a set with all FPExt that have been diagnoses, but either way not duplicating for an FPExt seems reasonable.

jdoerfert added inline comments.Jan 27 2021, 10:19 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9324

Nits:

  • make the function static.
  • Visited doesn't need to be a map, use a SmallPtrSet and check like this: if (!Visited.insert(I).second) continue;
jhuber6 updated this revision to Diff 319626.Jan 27 2021, 10:54 AM

Making suggested changes. I'll add a test case later.

jhuber6 updated this revision to Diff 319676.Jan 27 2021, 2:21 PM

Adding test. Changing store type check.

fhahn added a comment.Feb 16 2021, 1:27 PM

Thanks for the update. I left a few comments with respect to the test case. I think it would be good to add a few more tests to cover all cases in the worklist loop, including using instructions defined outside the loop in the fpext and multiple stores that share a fpext.

llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll
2 ↗(On Diff #319676)

not needed

3 ↗(On Diff #319676)

not needed

5 ↗(On Diff #319676)

If this requires the x86_64 triple, it should go into the X86 subdirectory.

7 ↗(On Diff #319676)

Personally I think the IR in the test case is concise and the C version should not be needed.

14 ↗(On Diff #319676)

dso_local and local_unnamed_addr are not needed. Is #0 needed?

16 ↗(On Diff #319676)

none of the instructions in the block should be needed?

33 ↗(On Diff #319676)

!tbaa should not be needed here and at other places.

36 ↗(On Diff #319676)

!llvm.loop should not be needed.

49 ↗(On Diff #319676)

Could you try to prune the metadata a bit? I think we would only need a location for the instruction we generate a remark for.

jhuber6 updated this revision to Diff 324113.Feb 16 2021, 2:12 PM

Pruning test file metadata.

Pruning test file metadata.

Thanks, the test looks much better now! I think all that's missing is tests for the other cases mentioned earlier.

jhuber6 updated this revision to Diff 324370.Feb 17 2021, 11:37 AM

Adding test additional test case that checks for operands defined outside the
loop, and makes sure multiple remarks are not emitted for the same floating
point extension instruction.

fhahn accepted this revision.Feb 17 2021, 12:22 PM

LGTM thanks! If you can, it might be good to add another test case where we emit multiple remarks.

llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll
33 ↗(On Diff #324370)

tiniest nit: now that we only branch to this block from for.body, it seems more naturally to have it at the end of the function :)

This revision is now accepted and ready to land.Feb 17 2021, 12:22 PM
jhuber6 updated this revision to Diff 324424.Feb 17 2021, 2:19 PM

Adding an extra check for multiple codes. I'll land it later today.

This revision was automatically updated to reflect the committed changes.