This is an archive of the discontinued LLVM Phabricator instance.

[LV] fold-tail flag
ClosedPublic

Authored by dorit on Aug 12 2019, 1:11 PM.

Details

Summary

This is the compiler-flag equivalent of the Predicate pragma (https://reviews.llvm.org/D65197). It directs the vectorizer to fold the remainder loop into the main loop using predication.
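For readers unfamiliar with the term, here is a rough scalar C++ sketch of what folding the remainder loop into the main loop means. It is only an illustration, not the vectorizer's actual output: the function names and the vectorization factor `VF = 4` are hypothetical, and the inner `lane` loop stands in for one vector instruction. The per-lane `active` test plays the role of the mask on a masked load/store.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vectorization factor; a real target would choose its own.
constexpr std::size_t VF = 4;

// Without tail folding: a main loop over whole VF-wide chunks,
// followed by a scalar remainder loop for the leftover elements.
std::vector<int> addWithRemainderLoop(const std::vector<int>& a,
                                      const std::vector<int>& b) {
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  std::vector<int> out(n);
  std::size_t i = 0;
  for (; i + VF <= n; i += VF)                     // main "vector" loop
    for (std::size_t lane = 0; lane < VF; ++lane)  // one vector op, emulated
      out[i + lane] = a[i + lane] + b[i + lane];
  for (; i < n; ++i)                               // scalar remainder loop
    out[i] = a[i] + b[i];
  return out;
}

// With tail folding: a single loop that also covers the last, partial
// chunk. Lanes that fall past the end are switched off by a predicate.
std::vector<int> addWithFoldedTail(const std::vector<int>& a,
                                   const std::vector<int>& b) {
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  std::vector<int> out(n);
  for (std::size_t i = 0; i < n; i += VF)
    for (std::size_t lane = 0; lane < VF; ++lane) {
      const bool active = i + lane < n;            // per-lane mask bit
      if (active)                                  // masked load/store
        out[i + lane] = a[i + lane] + b[i + lane];
    }
  return out;
}
```

Both functions compute the same result; the second simply has no separate remainder loop, at the cost of evaluating a mask on every iteration.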

Event Timeline

dorit created this revision. Aug 12 2019, 1:11 PM
SjoerdMeijer added inline comments. Aug 12 2019, 11:34 PM
test/Transforms/LoopVectorize/X86/tail_loop_folding.ll
2

Because these test cases have the vectorize.predicate.enable metadata set, I am not sure we are actually testing this new option. I don't think we are; I think we need a separate function without the predicate metadata.

17

I expect the output to be the same whether a pragma or this new option was used, so can we just use the CHECK tag?

dorit marked 2 inline comments as done. Aug 13 2019, 2:13 AM

Thanks for taking a look! Please see responses below.

test/Transforms/LoopVectorize/X86/tail_loop_folding.ll
2

"Because these test cases have the vectorize.predicate.enable metadata set"

Not really: the second function has the vectorize.predicate.enable metadata *disabled*. This is why, in the second function, the check for no masked loads/stores passes *without* the new flag (see the CHECK part), whereas *with* the new flag the check expects to find masked loads/stores (see the PREDFLAG part).

17

I'm not sure I understand what you are suggesting. I'm trying to distinguish between two runs: one without the flag and one with the flag. In the first function the output is indeed the same for both runs, but this is not the case in the second function. If I had the second run (with the flag) check all the CHECK tags, it would fail in the second function, where the output differs.

SjoerdMeijer accepted this revision. Aug 13 2019, 2:59 AM

Ah sorry, ignore me! I messed that up.

This looks like a good and straightforward change to me.

This revision is now accepted and ready to land. Aug 13 2019, 2:59 AM
This revision was automatically updated to reflect the committed changes.