This is an archive of the discontinued LLVM Phabricator instance.

[Polly] [IslCodeGenerator] Add OpenMP support

Authored by grosser on Sep 29 2014, 12:53 AM.



This hooks the existing OpenMP support, as it was already available in
the CLooG backend, to our isl code generator backend.

This is the last feature, that was only available in the CLooG backend and that
blocked the by-default use of the isl backend. As the isl backend is with this
commit not only at feature parity, but has several advantages (run-time
alias checks, delinerization, ...), we aim to switch the default code generation
backend to isl. This switch will take place after a brief in-tree testing period
for the new OpenMP support in our isl backend.

TODO: - Still needs to be documented/style-checked

  • We do not use -enable-polly-openmp yet
  • We may want to reduce the use of OpenMP specific variable names

Comments are already appreciated.

Diff Detail

Event Timeline

grosser updated this revision to Diff 14153.Sep 29 2014, 12:53 AM
grosser retitled this revision from to [Polly] [IslCodeGenerator] Add OpenMP support.
grosser updated this object.
grosser edited the test plan for this revision. (Show Details)
grosser added a reviewer: jdoerfert.
grosser added a subscriber: Unknown Object (MLST).
jdoerfert edited edge metadata.Sep 29 2014, 6:12 AM

Reviewed half of it but have to go now


This change can/should be commited earlier, LGTM btw. but is there a particular reason for MapVector instead of SmallDenseMap or just DenseMap?


I don't like the comment (style + content). It doesn't give any more information than the type of the map.


Please, do not use this awfull OMP any more but talk about parallel function arguments or sth. similar.

(also documentation)


When I see the type here I remember my patch to generalize the code generation (LoopGenerators). I hoped you would review that and work on that as a base. Any reasons not to?


isn't createForParallel much nicer?


I usually do not like this Object producing methodes, but prefere the object to be passed as reference instead, especially for Vectors/Sets/Maps/etc.


Any reason not to make this a static function? I'm not sure about such lambda's myself and I think we always used static functions so far.


I don't really see the benefit of static_cast here as User is a black box void*, however it might be good to start using static_cast instead of C-style casts all the time.


That is probably not going to work. The instruction might be in the SCoP but outside the parallel loop, thus still needed as an argument for the parallel subfunction.

If you agree, then I suggest to use the union_set_foreach_set only to collect all Stmts/BBs inside of For and then iterate again over them. This time we do the instruction operand checks but we check if the parent of an instruction operand is in the set of basic blocks we collected earlier. This way we should be able to identify all inner SCoP instructions we need to pass as arguments.


I would like the negated check better:

if (isa<Constant> || isa<Global>)



then continue, otherwise assume we need it.


Why back to std::set?


We could remove the need for the extra set by looking up I.first in IDToValue, if found we continue otherwise we go on.
Or even better we merge the two loops and use this lookup to decide what action to take. But now it looks to good to be true, what did I miss?


Back to stdlib...


This is hard to understand. If I got it correctly (but please tell me if I'm wrong), you collect all nodes of a function in post order in a set and erase them afterwards from the dominator tree of the function, right?


I stoped reviewing here, will continue tonight.

I still haven't looked at the test cases but I commented the rest of the patch. We should add more test cases and review all of them later.


remove the newline


I would call it It and ItId but that's not important.


I still do not know why we need this, shouldn't we just create the domtree on the fly (when we create the subfunc) correctly?


You already mentioned --enable-polly-openmp but I would like to rename that option when we are at it, e.g.,


This needs some docu, I don't know why we need this here? Whats in which map?


Where is that one used? Was it missing before?

grosser updated this revision to Diff 15884.Nov 6 2014, 11:17 AM
grosser edited edge metadata.

This new patch has the following changes:

  • Addressed most of the review comments of Johannes
  • Removed a couple of smaller changes that have been committed individually
  • Extracted the changes to the IslAst into
  • Full support for SCEV based code generation
  • Tested on LNT with and without SCEV based code generation
    • 1 failure with SCEV based code generation due to
    • 2 failures due to alias information being invalidated by openmp code generation

Sorry for the delay. I just pushed out a new patch which aims to address
your comments.

Some "final" comments. Maybe you want to address some of them. If LNT looks good I would be fine with this patch.


one more / please


What loop are left?




The name is not really descriptive.


I'm still not really convinced this is better/nicer than a static function.


I don't understand this. Can you add a comment, otherwise this is a magic condition that cannot be maintained or debuged.


I usually like the SmallPtrSet but that's taste I guess.


What about a flag to the loopgenerator such that it will not add these blocks to the DT at all (or just omit the DT argument)?


I don't understand the comment? e.g., What do you mean by ahead of time? When do we generate the references then?


There is another type of "SCEVVisitor", Sebastian pointed it out to me once. IT only has one function something like:

visit(const SCEV *S);

and you cast the SCEV yourself. I would suggest using that for visitors that are only interested in one or two SCEV types.


Same as above

simbuerg added inline comments.



Extract the values and SCEVs needed to generate code for a ScopStmt.


... when generating code for this statement.


Most likely the static function would get a descriptive name, such as: isGlobalValue, right? That is quite redundant given the only statement this function executes is: return isa<GlobalValue>(V);

Furthermore, I do not think that this static function would get called from somewhere else in here, so whats the point in having it around on the top-level.


... generate code for SCEVs...


Nice find, that simplifies so many things

grosser updated this revision to Diff 16203.Nov 14 2014, 4:23 AM

Adresses the latest review comments from Andreas and Johannes

In the previous patch, I addressed most reviewer comments.

Two points where I did not follow reviewer suggestions:

  • I kept the small lambda function.
  • I clean up the dominance tree at the end to stay in sync with CLooG (but added a FIXME to possibly improve on this later)





I added:

"Loops that are before the scop, but do not contain the scop itself are considered not locally defined."


I added:

+ / Create LLVM-IR that executes a for node thread parallel.
+ /// @param For The FOR isl_ast_node for which code is generated.






I forgot to add this to the patch, but I now renamed this to

void *UserPtr and struct FindValuesUser *User


I left the lambda for now.

Similar to Andreas I have the impression that this as good as we can find a use case for lambdas. Are you proposing to generally avoid lambdas?


This was already indirectly documented in the header of the surrounding function. For clarity I added a brief comment right above the stmt:

+ / Remove loops that contain the scop or that are part of the scop, as they
/ are considered local. This leaves only loops that are before the scop, but
+ /// do not contain the scop itself.




Interesting idea. This requires some work to understand if this is possible. I added a FIXME. For now I propose to use the same solution as we had in the CLooG backend.




Changed to:

+ We may also reference loops outside of the scop which do not contain the
scop itself, but as the number of such scops may be arbitrarily large we do
+ not generate code for them here, but only at the point of code generation
where these values are needed.


Changed. This really reduced the code needed here significantly.



jdoerfert accepted this revision.Nov 14 2014, 4:57 AM
jdoerfert edited edge metadata.

You can get a looks good to me now, if LNT looks fine too I'd suggest you commit it.
After formating of course ;)

This revision is now accepted and ready to land.Nov 14 2014, 4:57 AM
simbuerg accepted this revision.Nov 14 2014, 7:09 AM
simbuerg added a reviewer: simbuerg.
grosser closed this revision.Nov 15 2014, 1:50 PM

Committed in r222088