This is an archive of the discontinued LLVM Phabricator instance.

[GSoC 2016][Polly][WIP] Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions
ClosedPublic

Authored by gareevroman on Jul 26 2016, 11:59 AM.

Details

Summary

Extend the jscop interface to allow the user to export arrays. It is required that already existing arrays of the list of arrays correspond to arrays of the SCoP. Each array that is appended to the list will be newly created. Furthermore, we allow the user to modify access expressions to reference any array in case it has the same element type.

Diff Detail

Repository
rL LLVM

Event Timeline

gareevroman retitled this revision from to [GSoC 2016][Polly][WIP] Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions.
gareevroman updated this object.
gareevroman added a subscriber: pollydev.

This is a draft of the path. I think that it has at least the following drawbacks:

  1. Type of elements of newly created array is always double, even though it's different according to the extended jscop interface.

If I'm not mistaken, types of elements are represented by primitive types. Should we store their textual representation in JSONExporter to be able to map them to actual types?

  1. Type of dimension sizes of newly created arrays is always Int64.

I haven't understood yet how to properly get the type of dimension sizes in the JSONExporter.

  1. Dimensions of exported arrays are dimensions of ScopArrayInfos even though actual arrays have a smaller number of dimensions.

Should we try to understand whether new dimensions were introduced during delinearization?

I would be grateful for your comments and ideas.

grosser edited edge metadata.Jul 26 2016, 12:29 PM

This is a draft of the path. I think that it has at least the following drawbacks:

  1. Type of elements of newly created array is always double, even though it's different according to the extended jscop interface.

If I'm not mistaken, types of elements are represented by primitive types. Should we store their textual representation in JSONExporter to be able to map them to actual types?

We could, yes.

  1. Type of dimension sizes of newly created arrays is always Int64.

I haven't understood yet how to properly get the type of dimension sizes in the JSONExporter.

Which type. The type of the SCEVs given to getOrCreateScopArrayInfo. I believe they can have any (large enough) type. 64 bit should be fine.

  1. Dimensions of exported arrays are dimensions of ScopArrayInfos even though actual arrays have a smaller number of dimensions.

Should we try to understand whether new dimensions were introduced during delinearization?

We want to export our model, so we should just export what -polly-scops -analyze prints.

Best,
Tobias

include/polly/ScopInfo.h
2164 ↗(On Diff #65569)

Maybe call it createScopArrayInfo() to show that this function does the same as the function above.

lib/Analysis/ScopInfo.cpp
246 ↗(On Diff #65569)

I would prefer to not change the name, but to (optionally) set it when constructing the array.

3517 ↗(On Diff #65569)

Both cases do exactly the same. This seems wrong?

Do you want to do something like, then you don't even need the ArrayType variable and the condition:

for () {

   ElementType = ArrayType::get(ElementType, ....)
3521 ↗(On Diff #65569)

getTerminatorInst();

3522 ↗(On Diff #65569)

Please do not yet touch the IR here. I would prefer to just mark this ScopArrayInfo as 'virtual', e.g. by not having a baseptr and then generating the corresponding IR only in lib/Codegen/CodeGeneration.cpp.

lib/Exchange/JSONExporter.cpp
103 ↗(On Diff #65569)

This should probably say the new arrays.

336 ↗(On Diff #65569)

This could fit in polly::Scop

375 ↗(On Diff #65569)

ScopArrayInfo should have a getId method already.

496 ↗(On Diff #65569)
522 ↗(On Diff #65569)

Parsing the type is a good idea. Alternatively you could just export the size of the element type and always create an iX of the correct size. (Polly will always cast to the right type for actual memory accesses).

test/ScopInfo/create_arrays.ll
23 ↗(On Diff #65569)

This is complicated. Could you use a trivial example that just tests this functionality.
In fact, multiple examples that cover the different types, multi-dimensional arrays, ....

Also, please move this into test/Isl/CodeGen/MemAccess/ and add a code-generation test case as well.

test/ScopInfo/kernel_gemm___%bb8---%bb32.jscop
1 ↗(On Diff #65569)

Please save the original file as .jscop and the transformed file as .jscop.transformed

mreisinger added inline comments.
lib/Analysis/ScopInfo.cpp
3508 ↗(On Diff #65569)

I would say llvm:: can be omitted here since the surrounding uses of Type are unqualified too.

Meinersbur added inline comments.Jul 27 2016, 6:31 AM
lib/Exchange/JSONExporter.cpp
496 ↗(On Diff #65569)

RawStringOstream.str() == Array["type"].asString()

523 ↗(On Diff #65569)

for (;ArrayIdx < Arrays.size(); ArrayIdx+=1)
You might also consider
for (int i = ArrayIdx, End = Arrays.size(); i < End; i+=1)
as the more canonical form.

gareevroman edited edge metadata.

Thanks for the comments! I've tried to address them in this version of the patch.

gareevroman added inline comments.Jul 27 2016, 11:39 AM
lib/Analysis/ScopInfo.cpp
3511 ↗(On Diff #65781)

Thanks. The else branch should contain 'ArrayType = ArrayType::get(ArrayType, Sizes.at(Sizes.size() - i - 1));'.

3516 ↗(On Diff #65781)

Should we rewrite the Scop::getOrCreateScopArrayInfo so it creates a new ScopArrayInfo in case the baseptr is nullptr? Otherwise, it probably always returns the same object.

lib/Exchange/JSONExporter.cpp
512 ↗(On Diff #65781)

I think that textual representation of types is more intuitive. I've implemented its parsing in the last version of the patch using std::map. However, I'm not sure whether it is appropriate to initialize such a map in a function, where this map is used. What do you think about it?

I have one more question. Should we use StringMap instead of std::map?

Hi Roman, thank you for the update. This clearly goes in the right direction. I added a couple more comments.

Tobias

lib/Analysis/ScopInfo.cpp
3511 ↗(On Diff #65781)

The bug is fixed, but the code still looks unnecessarily complicated what about:

	​    if (!ArrayType)
               ArrayType = ElementType;

	​    ArrayType = ArrayType::get(ArrayType, Sizes.at(Sizes.size() - i - 1));
3516 ↗(On Diff #65781)

That seems to make sense, yes. This means we need to keep keep the list of ScopArrayInfo objects in one set and then have a separate map that contains references to map from baseptrs to scoparrayInfo objects that have baseptrs.

lib/Exchange/JSONExporter.cpp
512 ↗(On Diff #65781)

I am fine with a std::map. This is certainly not performance critical at the moment. I also think it is fine to initialize this in the function. The alternative would be to make this a global variable. This would speed up this very function, but slow down the startup of clang/opt as the map would always need to be initialized. Alternatively, you could make this a sorted array that is only contains strings and function pointers, and just do a binary search over the strings. This is probably the optimal solution in terms of speed, but as we use jscop currently just as a driver for our test cases compile time speed is not so important. Hence, I suggest to go for what you believe is the approach easiest to undertstand.

(We could also use parseType() from AsmParser/Parser.h, but this is only OK if opt, clang and bugpoint all link the AsmParser in already)

test/Isl/CodeGen/MemAccess/create_arrays.ll
21 ↗(On Diff #65781)

This already looks very good. Two recommendations:

  1. It would be great to make this test case really trivial. A single loop with a couple of memory accesses. Then introduce different accesses 1D/2D with different types to ensure the new code is really exercised. This large test case is useful for your GSoC overall goal, but it is larger than necessary for this patch. Also the access functions are unnecessarily complicated.
  1. Use array name A/B/C or A1/A2/A3. This is easier to read.
test/Isl/CodeGen/MemAccess/create_arrays___%bb8---%bb32.jscop.transformed
82 ↗(On Diff #65781)

If there is a benefit in these more complex test cases, please add a comment that states what additional functionality they exercise.

Hi Tobias,

thanks for the comments and quick replies!

lib/Exchange/JSONExporter.cpp
513 ↗(On Diff #66003)

OK. Thanks for the explanation.

Hi Roman,

thanks for the update!

include/polly/ScopInfo.h
1403 ↗(On Diff #66003)

I would make this a plain pointer and use a unique_ptr to track ownership in ScopArrayInfoSet

Also, you can make this a normal Map (without vector), as we won't iterate over this Map any more (instead we use the vector which contains _all_ arrays).

1415 ↗(On Diff #66003)

Why do we need a special compare function? The automatically derived comparision that checks for pointer equality and different integers in the second element of the pair should be enough, no?

1418 ↗(On Diff #66003)

Make this a SetVector, otherwise the iteration order is not clear.

lib/Analysis/ScopInfo.cpp
3493 ↗(On Diff #66003)

Can BaseName and BasePtr ever be set both? If not, please add an assert and document this.

3498 ↗(On Diff #66003)

Uff, this is rather involved. You make these arrays comparable via strings to support the case that we want to get back a ScopArrayInfo reference from its string name. As we probably need to support this, I would suggest to make this more explicit using a MapVector from std::string to ScopArrayInfo.

3499 ↗(On Diff #66003)

BasePtr seems is nullptr here, right? In this case, I do not think it makes sense to add the ScopArrayInfo into this map, as all of the new arrays will overwrite each other, no?

lib/CodeGen/CodeGeneration.cpp
151 ↗(On Diff #66003)

Can you make this part of IslNodeBuilder. CodeGeneration.cpp is just one user of the IslNodeBuilder (PPCGCodeGeneration is another and there may be more external ones).

156 ↗(On Diff #66003)

I suggest to move this further down to the area where we actually generate code. Maybe before NodeBuilder.addParameters()?

test/Isl/CodeGen/MemAccess/create_arrays.ll
22 ↗(On Diff #66003)

Nice.

gareevroman added inline comments.Jul 29 2016, 4:20 AM
include/polly/ScopInfo.h
1415 ↗(On Diff #66003)

Yes, I added it to make these arrays comparable via strings.

1418 ↗(On Diff #66003)

We preserve the order to be able to check that a SCoP contains arrays that we expect. Right?

lib/Analysis/ScopInfo.cpp
3493 ↗(On Diff #66003)

I thought that it's redundant, but it isn't a mistake. OK. I'll add an assert.

3498 ↗(On Diff #66003)

Maybe we should store all created arrays in a separate map. What do you think about it? Although it probably requires additional methods to get a new iterator range, it should possibly make maintenance of created arrays easier.

P.S.: I get the following compilation error, if I try to declare 'MapVector<std::string, std::unique_ptr<ScopArrayInfo>> ScopArrayInfoSet2;'. Possibly it doesn't suite for storing std::string.

"In file included from /home/roman/Documents/polly/llvm/include/llvm/Analysis/AliasSetTracker.h:20:0,

from /home/roman/Documents/polly/llvm/tools/polly/include/polly/ScopDetectionDiagnostic.h:26,
from /home/roman/Documents/polly/llvm/tools/polly/include/polly/ScopDetection.h:50,
from /home/roman/Documents/polly/llvm/tools/polly/include/polly/ScopInfo.h:21,
from /home/roman/Documents/polly/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:20:

/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h: In instantiation of ‘void llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::destroyAll() [with DerivedT = llvm::DenseMap<std::basic_string<char>, unsigned int, llvm::DenseMapInfo<std::basic_string<char> >, llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int> >; KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’:
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:617:5: required from ‘llvm::DenseMap<KeyT, ValueT, KeyInfoT, BucketT>::~DenseMap() [with KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’
/home/roman/Documents/polly/llvm/include/llvm/ADT/MapVector.h:32:7: required from here
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:308:57: error: ‘isEqual’ is not a member of ‘llvm::DenseMapInfo<std::basic_string<char> >’

!KeyInfoT::isEqual(P->getFirst(), TombstoneKey))
                                              ^

/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:307:53: error: ‘isEqual’ is not a member of ‘llvm::DenseMapInfo<std::basic_string<char> >’

if (!KeyInfoT::isEqual(P->getFirst(), EmptyKey) &&
                                              ^

/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h: In instantiation of ‘static const KeyT llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::getEmptyKey() [with DerivedT = llvm::DenseMap<std::basic_string<char>, unsigned int, llvm::DenseMapInfo<std::basic_string<char> >, llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int> >; KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’:
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:305:39: required from ‘void llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::destroyAll() [with DerivedT = llvm::DenseMap<std::basic_string<char>, unsigned int, llvm::DenseMapInfo<std::basic_string<char> >, llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int> >; KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:617:5: required from ‘llvm::DenseMap<KeyT, ValueT, KeyInfoT, BucketT>::~DenseMap() [with KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’
/home/roman/Documents/polly/llvm/include/llvm/ADT/MapVector.h:32:7: required from here
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:392:34: error: ‘getEmptyKey’ is not a member of ‘llvm::DenseMapInfo<std::basic_string<char> >’

return KeyInfoT::getEmptyKey();
                             ^

/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h: In instantiation of ‘static const KeyT llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::getTombstoneKey() [with DerivedT = llvm::DenseMap<std::basic_string<char>, unsigned int, llvm::DenseMapInfo<std::basic_string<char> >, llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int> >; KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’:
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:305:73: required from ‘void llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::destroyAll() [with DerivedT = llvm::DenseMap<std::basic_string<char>, unsigned int, llvm::DenseMapInfo<std::basic_string<char> >, llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int> >; KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:617:5: required from ‘llvm::DenseMap<KeyT, ValueT, KeyInfoT, BucketT>::~DenseMap() [with KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’
/home/roman/Documents/polly/llvm/include/llvm/ADT/MapVector.h:32:7: required from here
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:395:38: error: ‘getTombstoneKey’ is not a member of ‘llvm::DenseMapInfo<std::basic_string<char> >’

return KeyInfoT::getTombstoneKey();
                                 ^

[100%] Generating LLVMLTORevision.h
[100%] Built target LLVMInterpreter
[100%] Built target LLVMAArch64CodeGen
[100%] Built target LLVMAMDGPUCodeGen
[100%] Built target LLVMBPFCodeGen
[100%] Built target LLVMARMCodeGen
[100%] Built target LLVMMSP430CodeGen
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h: In static member function ‘static const KeyT llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::getEmptyKey() [with DerivedT = llvm::DenseMap<std::basic_string<char>, unsigned int, llvm::DenseMapInfo<std::basic_string<char> >, llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int> >; KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’:
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:393:3: warning: control reaches end of non-void function [-Wreturn-type]

}
^

/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h: In static member function ‘static const KeyT llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::getTombstoneKey() [with DerivedT = llvm::DenseMap<std::basic_string<char>, unsigned int, llvm::DenseMapInfo<std::basic_string<char> >, llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int> >; KeyT = std::basic_string<char>; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<std::basic_string<char> >; BucketT = llvm::detail::DenseMapPair<std::basic_string<char>, unsigned int>]’:
/home/roman/Documents/polly/llvm/include/llvm/ADT/DenseMap.h:396:3: warning: control reaches end of non-void function [-Wreturn-type]

}"
3499 ↗(On Diff #66003)

Right.

lib/CodeGen/CodeGeneration.cpp
151 ↗(On Diff #66003)

OK.

156 ↗(On Diff #66003)

OK.

Brief comments.

include/polly/ScopInfo.h
1415 ↗(On Diff #66003)

Right. I personally would prefer to leave SAI comparable via Pointers and use a map from strings to SAI to find the correct SAI for a given string.

1418 ↗(On Diff #66003)

By using a set vector we ensure a deterministic order when enumerating the elements in this set. This is important to ensure that the order in which you create AllocaInstructions is deterministic and does _not_ depend on the memory allocator on which Polly runs. Otherwise, test case outputs may change between systems or even on the same system.

lib/Analysis/ScopInfo.cpp
3498 ↗(On Diff #66003)

I would prefer to keep a single set that has ownership of all SAI.

Maybe you could use a StringMap to go from StringToSAI, and a separate std::SetVector to keep ownership of the set of SAI available..

I would make this a plain pointer and use a unique_ptr to track ownership in ScopArrayInfoSet

Since we want to use the MapVector, we should probably use plain pointers instead of a unique_ptrs. Otherwise, we get an error that is similar to the one caused by using std::string as a type of its keys.

Maybe we should use a unique_ptr to track ownership in ScopArrayInfoMap and ScopArrayNameMap?

P.S. I generate alloca in split blocks, because after their creation in executeScopConditionally getEnteringBlock returns a pointer to such a block. It seems that they are suitable for us, because, if I'm not mistaken, they are predecessors of SCoPs.

grosser accepted this revision.Jul 29 2016, 9:50 AM
grosser edited edge metadata.

LGTM. Can you please test with POLLY_GPGPU enabled in cmake. If that passes, feel free to commit.

include/polly/CodeGen/IslNodeBuilder.h
58 ↗(On Diff #66140)

For all NEW arrays ...

lib/CodeGen/BlockGenerators.cpp
490 ↗(On Diff #66140)

Nice.

This revision is now accepted and ready to land.Jul 29 2016, 9:50 AM
grosser added inline comments.Jul 30 2016, 1:11 AM
lib/CodeGen/IslNodeBuilder.cpp
1178 ↗(On Diff #66140)

Just realized. This is not correct. Use Builder.GetCurrentBlock()->getParent()->getEntry();

Otherwise, if there is a loop around the scop, we will run out of stack space.

gareevroman added inline comments.Jul 30 2016, 1:45 AM
lib/CodeGen/IslNodeBuilder.cpp
1178 ↗(On Diff #66140)

I get the following error:

"error: ‘polly::PollyIRBuilder’ has no member named ‘GetCurrentBlock’"

If I'm not mistaken, IRBuilder doesn't have methods that return basic blocks.

This revision was automatically updated to reflect the committed changes.

I get the following error:

"error: ‘polly::PollyIRBuilder’ has no member named ‘GetCurrentBlock’"

If I'm not mistaken, IRBuilder doesn't have methods that return basic
blocks.

GetInsertBlock()

OK. Thanks for the comment!

P.S.: I've also updated access to arrays of SCoPs from CodeGen/PPCGCodegGeneration.cpp. Please revert the patch, if it's wrong. The last version of the patch passes tests with POLLY_GPGPU enabled and doesn't cause a regression in case of tests from the LNT test suite.