This is an archive of the discontinued LLVM Phabricator instance.

[Polly] [OpenMP] Update Subtree values for nested parallel loop
Needs ReviewPublic

Authored by mbdharan on May 24 2017, 3:05 PM.

Details

Summary

When generating code for passing values into the sub-function, update old values of arguments if any newer values are present. This ensures that values referenced are available to the caller of the sub-function.

Fixes llvm.org/PR33153

Diff Detail

Event Timeline

mbdharan created this revision.May 24 2017, 3:05 PM
grosser edited edge metadata.May 25 2017, 10:50 PM

Hi Mohan,

thanks for working on this bug. I am slightly surprised we even try to generate code with nested parallelism.

bool IslAstInfo::isExecutedInParallel(__isl_keep isl_ast_node *Node) {           
                                                                                 
  if (!PollyParallel)                                                            
    return false;                                                                
                                                                                 
  // Do not parallelize innermost loops.                                         
  //                                                                             
  // Parallelizing innermost loops is often not profitable, especially if        
  // they have a low number of iterations.                                       
  //                                                                             
  // TODO: Decide this based on the number of loop iterations that will be       
  //       executed. This can possibly require run-time checks, which again      
  //       raises the question of both run-time check overhead and code size     
  //       costs.                                                                
  if (!PollyParallelForce && isInnermost(Node))                                  
    return false;                                                                
                                                                                 
  return isOutermostParallel(Node) && !isReductionParallel(Node);                
}

checks specifically for outermost parallelism. Something seems broken here.

Even if we generate correct LLVM-IR, I am not sure nested parallelism makes sense when calling OpenMP semantically and especially not performance wise. I think as a first step we should investigate why we generate nested parallelism at all.

Altough I fixed the related bug in r343212, this patch might still be useful in case we ever would want to allow nested parallelism.