This is an archive of the discontinued LLVM Phabricator instance.

[LV] Enable stride versioning to support Fortran IR
Needs ReviewPublic

Authored by peixin on Apr 4 2023, 8:33 AM.

Details

Summary

The previous versioning speculates the stride to be one. This is
not true for Fortran IR generated by current LLVM Flang. One
example is as follows:

%addr = getelementptr i8, ptr %array, i64 %offset
%val = load float, ptr %addr

In this case, the stride unit is 4 bytes, so the stride should be
speculated to be 4. This patch enables stride versioning to support
Fortran IR by compare the access type and Gep result type.

With this patch, Fortran code with pointer array and assumed-shape
array can be stride versioned in LV.

Diff Detail

Event Timeline

peixin created this revision.Apr 4 2023, 8:33 AM
peixin requested review of this revision.Apr 4 2023, 8:33 AM
reames added a comment.Apr 5 2023, 7:43 AM

JFYI, I find the description on this patch confusing. I think I've managed to understand it, but let me confirm.

Given an induction variable with a loop invariant (but not constant) stride, LAA currently speculates the stride is 1. In the case where the access type and the index type differ, this results in a meaningless speculation. Instead of unconditionally speculating 1, we can speculate that the stride is the constant required to stride sizeof(element-type) on each iteration.

Is that a correct understanding?

Also, why does this depend on the prior patch? This seems like an independent change.

llvm/test/Transforms/LoopVectorize/stride-accesses-unit-check-fix.ll
34

Please use update_test_checks.py

56

Please reduce this test further.

96

Please remove stray metadata.

102

You definitely need more than one test here. A few cornercases to consider:

  • element type == index type
  • element type < index type
  • gep with multiple uses
peixin updated this revision to Diff 511121.Apr 5 2023, 8:55 AM
peixin edited the summary of this revision. (Show Details)
peixin added a comment.Apr 5 2023, 9:04 AM

JFYI, I find the description on this patch confusing. I think I've managed to understand it, but let me confirm.

Given an induction variable with a loop invariant (but not constant) stride, LAA currently speculates the stride is 1. In the case where the access type and the index type differ, this results in a meaningless speculation. Instead of unconditionally speculating 1, we can speculate that the stride is the constant required to stride sizeof(element-type) on each iteration.

Is that a correct understanding?

Yes, that's what this patch does. Sorry to bother you. Should I change the description like you said?

Also, why does this depend on the prior patch? This seems like an independent change.

The IR filecheck depends on the prior patch since I develop this patch after that locally. I removed the dependence now.

llvm/test/Transforms/LoopVectorize/stride-accesses-unit-check-fix.ll
34

Thanks. Fixed.

56

I renamed the IR variable names and try my best to reduce it. It can still be reduced to reproduce the problem, but it will have different meanings as the source code.

96

Thanks. Fixed.

102

element type == index type

Added it.

element type < index type

It does not exist in Fortran IR. So in function code, I still use 1 for this case.

gep with multiple uses

Do you mean the following:

%0 = getelementptr i8, ptr %base, i64 %offset
%1 = load float, ptr %0
%2 = load i32, ptr %0

I cannot find one IR generated from real code including C/C++/Fortran.

peixin updated this revision to Diff 511998.Apr 9 2023, 6:09 AM
peixin edited the summary of this revision. (Show Details)

Rebase and refine the implementation so that it is more readable.

ping for review

junaire added inline comments.
llvm/lib/Analysis/LoopAccessAnalysis.cpp
173
176
178
peixin updated this revision to Diff 513916.Apr 15 2023, 8:44 AM
  1. Address the comments.
  2. Reverse to the previous implementation since it is more robust. The use of gep instruction may not be load or store instruction.
peixin added a comment.Jul 7 2023, 8:50 AM

@kiranchandramohan Hi Kiran, could you please help commandeer this patch? I cannot continue working on it due to work changes.

david-arm added inline comments.Jul 10 2023, 1:53 AM
llvm/test/Transforms/LoopVectorize/stride-accesses-unit-check-fix.ll
34

Not sure if we should be including target-specific things for tests in the top level LoopVectorize directory?

peixin added inline comments.Jul 10 2023, 1:59 AM
llvm/test/Transforms/LoopVectorize/stride-accesses-unit-check-fix.ll
34

It's common to include the target datalayout.

$ grep -r "target datalayout" llvm/test/Transforms/LoopVectorize/
llvm/test/Transforms/LoopVectorize/pr39417-optsize-scevchecks.ll:target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
llvm/test/Transforms/LoopVectorize/no_array_bounds.ll:target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll:target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
llvm/test/Transforms/LoopVectorize/libcall-remark.ll:target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll:target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"

Usually we don't include target triple. But target datalayout is required to run opt.

@kiranchandramohan BTW, please notice that ValueToValueMap in Loop Access Analysis is replaced by DenseMap<Value *, const SCEV *> in D147750 if you commandeer this patch and rebase it.

@kiranchandramohan BTW, please notice that ValueToValueMap in Loop Access Analysis is replaced by DenseMap<Value *, const SCEV *> in D147750 if you commandeer this patch and rebase it.

Thanks @peixin. Will keep that in mind.