This is an archive of the discontinued LLVM Phabricator instance.

[Polly] [Fortran Support] Generate GPU kernels for Fortran arrays
Needs ReviewPublic

Authored by bollu on May 16 2017, 1:31 PM.

Details

Summary
  • Depends on D32967.
  • Have rudimentary test cases that this works.
  • Would like a thorough review in case I'm missing something obvious in the implementation.
  • Putting this up as a [WIP], want to cleanup some of the code.

Event Timeline

bollu created this revision.May 16 2017, 1:31 PM
Meinersbur edited edge metadata.May 18 2017, 9:18 AM

I expected a change in ScopBuilder::buildMemoryAccess as well. Which case there handles Fortran arrays?

bollu updated this revision to Diff 99689.May 21 2017, 3:23 AM
  • Update to new master with dependent patches being merged in.
bollu updated this revision to Diff 99693.May 21 2017, 6:30 AM
  • Fix style in PPCGCodeGeneration.cpp
  • Docyment methods & fix style in ScopInfo.h
  • Simplify`PPCGCodeGeneration.cpp - pollyBuildAstExprForStmt`

This title "Generate GPU kernels for Fortran arrays" is far far too general. You only modify how the "extent" is computed. Please explain in more detail what you are doing.

lib/CodeGen/PPCGCodeGeneration.cpp
133–136

Please explain why this is necessary.

2158–2165

I'd first try to derive the extent from the accesses and only if that fails, and only if that doesn't work, derive the maximal possible extend from the outermost array dimension. That should also work if it is not Fortran array, but whenever all dimension sizes are defined.

2183

Please no const of local variables.

I know you like this, but if applied consistently nearly every local variable must be const, adding a lot of syntactic clutter.

2449–2450

Is the coding style in this file supposed to be different than in the rest of Polly? (LLVM coding style says parameter names start with capital letter)

test/GPGPU/fortran-copy-kernel-affine.ll
2

You removed [WIP]. but TODO is still there.

14

Fortran

bollu updated this revision to Diff 99864.May 23 2017, 2:17 AM
  • [NFC wrt patch] Style fixes as suggested by @Meinersbur
bollu added inline comments.May 23 2017, 2:19 AM
lib/CodeGen/PPCGCodeGeneration.cpp
133–136

getAddressFunction() calls getAddressRelation() and then lexmins it. This will not work in case of a Fortran array because it's unbounded in the outermost dimension. So, we do the bounding manually.

2158–2165

Hm, so you wish to derive extents from the accesses for Fortran arrays as well? I'm not sure what you're saying, could you please clarify?

bollu updated this revision to Diff 99866.May 23 2017, 2:36 AM
  • [NFC wrt patch] simplify tests & remove cruft
bollu added a comment.May 23 2017, 2:38 AM

@Meinersbur: Updated, and left a comment on something I don't really understand.

lib/CodeGen/PPCGCodeGeneration.cpp
133–136

Wouldn't it a better approach to apply the domain constraints to the access relation in general? We could do this in getAccessFunction() itself, the intersection is done in both acceses.

Second, I am not sure this does what you intent to do. For non-affine access indices, this will just give you an expression to the first element of the array, but I think you will want to access other elements as well.

What you may want is a case distinction:

  1. The access is affine: Use base pts and access relation/function
  2. The access is non-affine: Use only the base ptr, use the original index expression.
2158–2165

Let's say you have an access

for (int i = 5; i < 10; i++)
  A[i]

For this access you need the elements A[5..9] to be transferred, but not A[0..4]. The older preexisting code would get you the range A[5..9]. Your Fortran code would get you the range A[0..N], which includes some unnecessary elements.

If you can compute the extent the former way, there is no reason to use the more pessimistic approach to transfer the whole array. You could even compute the extent which might be unbounded in some dimensions, and then interstect with the array dimensions if those are known.

bollu added inline comments.May 24 2017, 9:44 AM
lib/CodeGen/PPCGCodeGeneration.cpp
133–136

how do I know whether the access is affine or not? Do we register this information somewhere?

2158–2165

Ah, I see. OK, I get it now.

bollu added inline comments.May 26 2017, 2:49 AM
lib/CodeGen/PPCGCodeGeneration.cpp
133–136

found isAffine. However, in IslNodeBuilder.cpp, the following code exists:

assert(MA->isAffine() &&
       "Only affine memory accesses can be code generated");

so, I'm not sure how you wish to extract a pw_aff from the original expression, because in a non-affine case it would just be a relation to everything in the output, correct?

sample-relation.isl
[p_0_loaded_from_n, MemRef0_fortranarr_size, MemRef1_fortranarr_size] -> { Stmt_9[i0] -> MemRef0[o0] }
Meinersbur added inline comments.May 26 2017, 4:23 AM
lib/CodeGen/PPCGCodeGeneration.cpp
133–136

The assertion should only apply if setNewAccessRelation() is used. Otherwise it just tries to reproduce the old pointer, not generating it from the access relation.

Btw:
setNewAccessRelation() is already used by Scop::canonicalizeDynamicBasePtrs(). This effectively means we cannot do invariant load hoisting on base pointers of non-affine accesses. As discussed in D28518.

I'm not sure how you wish to extract a pw_aff from the original expression, because in a non-affine case it would just be a relation to everything in the output, correct?

Yes, and we only need the maximum extent of all accesses. In you example, Say the domain is

{ Stmt_9[i] : 0 <= i < 10 }

The extend without bound information would be:
{ MemRef0[i] } since it can access anything, even with bounded domain.
Given the array size information, we can intersect the exent with the maximum in-bounds elements, which will be

{ MemRef0[i] : 0 <= i < N }
Meinersbur added inline comments.May 26 2017, 4:24 AM
lib/CodeGen/PPCGCodeGeneration.cpp
133–136

N would be your MemRef0_fortranarr_size.

bollu added inline comments.May 26 2017, 4:53 AM
lib/CodeGen/PPCGCodeGeneration.cpp
133–136

Perhaps I am mis-understanding what the function does.

  1. Isn't pollyBuildAstExprForStmt supposed to return a map from the ID of the memory access to a pw_aff expression which represents the access index?
  1. If our _access_ is not affine, something like for(i) { A[i *i] };, we do not store the exact non-affine access, nor can this be represented as pw_aff. So, how do we create an isl_ast_expr *Access corresponding to this access?

Yes, and we only need the maximum extent of all accesses. In you example, Say the domain is

{ Stmt_9[i] : 0 <= i < 10 }
The extend without bound information would be:
{ MemRef0[i] } since it can access anything, even with bounded domain.
Given the array size information, we can intersect the exent with the maximum in-bounds elements, which will be

{ MemRef0[i] : 0 <= i < N }

I don't understand, how does this relate to the creation of the AddrFunc?

bollu added a comment.May 29 2017, 2:08 AM

@Meinersbur : ping, I believe I need some feedback from you on this to continue.

bollu updated this revision to Diff 100865.May 31 2017, 7:56 AM
  • Check if Fortran array's outermost dimension can be bounded before using the pwAff.
  • remove code related to non-linear accesses in pollyBuildAstExprForStmt.
bollu added a comment.May 31 2017, 7:57 AM

@Meinersbur: I've changed the way getExtent works to only use the outermost dimension parameter in case the size is not bounded statically. I've also changed the getAddressFunction stuff to only-linear.

Could you take another look at the code now? Thanks.

jdoerfert resigned from this revision.Jun 11 2019, 7:57 AM