This is an archive of the discontinued LLVM Phabricator instance.

[Polly] Load/Store scalar accesses before/after the statement itself
ClosedPublic

Authored by Meinersbur on Oct 6 2015, 4:40 PM.

Details

Summary

Instead of generating implicit loads within basic blocks, put them before the instructions of the statment itself, including non-affine subregions. The region's entry node is dominating all blocks in the region and therefore the loaded value will be available there.

Implicit writes in block-stmts were already stored back at the end of the block. Now, also generate the stores of non-affine subregions when leaving the statement, i.e. in the exiting block.

This change is required for array-mapped implicits ("De-LICM") to ensure that there are no dependencies of demoted scalars within statments. Statement load all required values, operator on copied in registers, and then write back the changed value to the demoted memory. Lifetimes analysis within statements becomes unecessary.

Diff Detail

Repository
rL LLVM

Event Timeline

Meinersbur updated this revision to Diff 36675.Oct 6 2015, 4:40 PM
Meinersbur retitled this revision from to [Polly] Load/Store scalar accesses before/after the statement itself.
Meinersbur updated this object.
Meinersbur added reviewers: jdoerfert, grosser.
Meinersbur added a project: Restricted Project.
Meinersbur added subscribers: llvm-commits, pollydev.
jdoerfert added inline comments.Oct 6 2015, 11:28 PM
lib/CodeGen/BlockGenerators.cpp
1176 ↗(On Diff #36675)

This part and the one below seem tricky and I have to understand the change better to get why/what happens here.

The idea of a single value map seems reasonable though. Are these two changes coupled?

Meinersbur added inline comments.Oct 7 2015, 1:34 PM
lib/CodeGen/BlockGenerators.cpp
1176 ↗(On Diff #36675)

This part and the one below seem tricky and I have to understand the change better to get why/what happens here.

These conditions should become unecessary when I do the work to remove redunant accesses earlier in the Polly pipeline. Until then, this is required to keep the unit tests happy.

The idea of a single value map seems reasonable though. Are these two changes coupled?

Yes. Before it would be possible scalar loads are created in a BB, and then a BB processed later would look it up and use it although the two BBs might be unrelated in the dominator tree. (Practically I think the generateScalarLoads would just overwrite the older entry in ValueMap, but I am not brave enough to try it). After this patch all scalar loads are inserted into the EntryBB, which is guarantee to dominate everything in the subregion.

grosser edited edge metadata.Oct 8 2015, 12:24 AM

As discussed yesterday, the direction of this patch is good. I leave you two finish the patch review.

jdoerfert edited edge metadata.Oct 10 2015, 5:49 AM

There are 3 major points to this patch:

  1. The code movement between the copyBB methods seem not necessary and should be avoided to simplify the insertion point requirements needed.
  2. The dominance check (as well as the isPHI check before that) are complicated and I still do not follow what/how they are different to what was happening before. I think they somehow simplify the generated code but that only hides the underlying issue
  3. The general idea of one map and the early reload of scalars seems reasonable and looks good except what is mentioned in point 1) and 2).
lib/CodeGen/BlockGenerators.cpp
341 ↗(On Diff #36675)

If we keep the SetInsertPoint and the generateScalarXXX calls in the other copyBB function we don't need to set the insert point manually outside (noted somewhere below). If there is a reason you moved it here please explain.

1035 ↗(On Diff #36675)

BlockMap seems to be obsolet if we store the mapping in the ValueMap too. If so we could remove it now or later.

1038 ↗(On Diff #36675)

As mentioned above, this complicates the copyBB logic and could be hidden.

1195 ↗(On Diff #36675)

As mentioned in the phone call, we should never need to check dominance here as we didn't do it before. If this patch somehow requires that, it might not be the best way to do it.

Can you illustrate why this is needed (with an example) and why it is beneficial not to store the scalars as before? We simply generated scalar stores in the blocks they were defined and did not need to reason about insertion points and dominance at all.

test/Isl/CodeGen/non-affine-phi-node-expansion-3.ll
20 ↗(On Diff #36675)

While the result looks nice I think the way we achieved is not what we want. We should not generate the scalar write accesses in the SCoP if they are not needed but now we apparently filter them in the code generation.

Meinersbur updated this object.
Meinersbur edited edge metadata.

One LNT test was failing (consumer-lame). In some cases the SCEV expander can generate code into the current position. This yields invalid code if a neighboring block which is not dominated by the former block tries to expand the same SCEV.

The update undoes the use single ValueMap for entire subregion patch, therefore adds another map that contains all values that are potentially visible in other ScopStmts. Redundant maps to be cleaned up in a separate patch.

I would have liked a handwritten testcase , but I could not easily determine under which conditions SCEV expander does this.

  1. The code movement between the copyBB methods seem not necessary and should be avoided to simplify the insertion point requirements needed.

Undone in update

  1. The dominance check (as well as the isPHI check before that) are complicated and I still do not follow what/how they are different to what was happening before. I think they somehow simplify the generated code but that only hides the underlying issue

The dominance check will go away in a separate patch that depends on this one.

lib/CodeGen/BlockGenerators.cpp
341 ↗(On Diff #36675)

SetInsertPoint needs to be set before generateScalarLoads. RegionGenerator does not call generateScalarLoads before each copyBB. If moving Builder.SetInsertPoint(CopyBB->begin()) into copyBB, this overload would still need to call it before generateScalarStores.

1035 ↗(On Diff #36675)

I aggree, should be a separate change.

1195 ↗(On Diff #36675)

As mentioned in the phone call, we should never need to check dominance here as we didn't do it before. If this patch somehow requires that, it might not be the best way to do it.

As mentioned in the phone call and a message to the mailing list, the current code does generate such MemoryAccesses. Some later planned change will hopefully make this check go away.

Can you illustrate why this is needed (with an example) and why it is beneficial not to store the scalars as before? We simply generated scalar stores in the blocks they were defined and did not need to reason about insertion points and dominance at all.

Explained in summary message. It is not beneficial by itself, but required for later changes.

test/Isl/CodeGen/non-affine-phi-node-expansion-3.ll
20 ↗(On Diff #36675)

This is a side-effect because val0 and val1 are dominating the non-affine region's exit, i.e. it is not possible to have such a store in the exit block. A planned change will clean up the logic that generates MemoryAccesses.

I didn't not understand why check for instructions that shouldn't even be there, so I just removed the check.

Hi Michael,

to (hopefully) resolve some open misunderstandings and to get this patch in quickly, I looked myself through these changes. As I said earlier, the general direction seems clear, but when trying to understand your change in detail (and looking at Johannes questions) some implementation details remained unclear to me. Sorry for bring these up so late, but answering them would really help my understanding.

Best,
Tobias

lib/CodeGen/BlockGenerators.cpp
341 ↗(On Diff #37133)

@johannes: I think the point is that for region statements, Michael wants to generate scalar stores not before and after each basic block, but only once at the beginning and at the end of the entire ScopStmt. To my understanding "keep[ing] the SetInsertPoint and the generateScalarXXX in the other copyBB function" would allow the former, but not the latter. This is why Michael has difficulties to address your comment.

I have been thinking about this myself, but I do not see how to implement this better besides documenting that copyBB requires generateScalarLoads/Stores to be run explicitly before/after copyBB is called for all BBs in a ScopStmt.

438 ↗(On Diff #37133)

Nice simplification!

1048 ↗(On Diff #37133)

@johannes: I have no idea how this could be hidden assuming we want indeed "scalar copies only once at the beginning and at the end of the entire ScopStmt" as described in an earlier comment. Johannes, did you write this assuming "scalar copies for each basic block" or have you a solution in mind that works also for "scalar copies only once"?

1150 ↗(On Diff #37133)

It is not yet clear why we need a special implementation here. If I replace this code with just a call to BlockGenerator::generateScalarLoads(Stmt, BBMap)" all tests pass.

You comment here:

Only generate loads for PHIs in the entry block. Intra-ScopStmt PHIs are
copied directly.

but this comment does not seem to be related to the if-condition that follows, as inter-ScopStmt PHIs are not even modeled and do consequently not need to be skipped.

Did you assume we model Intra-ScopStmt PHI nodes or was there another reason for including this code that I missed?

1210 ↗(On Diff #37133)

As mentioned in the phone call and a message to the mailing list, the current code does generate such MemoryAccesses. Some later planned change will hopefully make this check go away.

Michael, without giving an example this is hard to understand. I commented out the code and tried to generate all scalar writes in the exit block, but this failed as you promised. I give now an example:

               bb1:
               br non-affine bb2, bb3

bb2:                                       bb3:
  a =                                      b =
  br exit                                  br exit


exit:
  val = phi [a, %bb2], [b, %bb3]

For this example we create the RegionStmt bb1->exit which has two PHI writes. It is important that these writes happen in bb2 and bb3. Neither the values written may dominate the exit block nor can we choose which value to write without knowing if the control flow passed through bb2 or bb3. Hence for PHI-node writes we actually need to create them in the basic block statement they are created in (this will always be an exiting statement of the Region). Am I correct this is the reason for these PHI-node special cases?

I tried to simplify the conditions above and came up with the following code which still passes all tests:

for (MemoryAccess *MA : Stmt) {                                                 
  if (MA->isExplicit() || MA->isRead())                                         
    continue;                                                                   
                                                                                
  Instruction *ScalarInst = MA->getAccessInstruction();                         
  Value *Val = MA->getAccessValue();                                            
                                                                                
  // In case we add the store into an exiting block, we need to restore the     
  // position for stores in the exit node.                                      
  auto SavedInsertionPoint = Builder.GetInsertPoint();                          

  BasicBlock *ExitingBB = ScalarInst->getParent();                              
  BasicBlock *ExitingBBCopy = BlockMap[ExitingBB];                              
  Builder.SetInsertPoint(ExitingBBCopy->getTerminator());                       
                                                                                
  auto Address = getOrCreateAlloca(*MA);                                        
                                                                                
  Val = getNewScalarValue(Val, R, Stmt, LTS, BBMap);                            
  Builder.CreateStore(Val, Address);                                            
                                                                                
  // Restore the insertion point if necessary.                                  
  Builder.SetInsertPoint(SavedInsertionPoint);                                  
}

Did I miss something important or are these simplifications correct?

Also, is it correct that we code-generate all PHI nodes at their normal statement locations? Hence, the copy-out code for PHI-nodes does not introduce any functional change?

test/Isl/CodeGen/non-affine-phi-node-expansion-3.ll
20 ↗(On Diff #37133)

This is a side-effect because val0 and val1 are dominating the non-affine region's exit, i.e. it is not possible to have such a store in the exit block. A planned change will clean up the logic that generates MemoryAccesses.

I have troubles following this reasoning. If this code was correct before, why would it now be incorrect? Would it break the no-scalar-in-statement dependency property you want to establish?

I didn't not understand why check for instructions that shouldn't even be there, so I just removed the check.

Which check are you talking about? I could not find a check in Polly's source code that was dropped.

Independent of the comments above, your test seems to be unnecessarily weak. AKA, it would also pass if the original output is generated. I think if the new output is indeed preferable, we should check that it is really generated.

test/Isl/CodeGen/scev_expansion_in_nonaffine.ll
18 ↗(On Diff #37133)

This test case currently fails for me due to the instruction number having changed from %64 to %39. Using FileCheck Variables [1], this dependence on the instruction number can be removed.

; CHECK-LABEL:  polly.stmt.if.then.110:                                          
; CHECK:          [[REGISTER:%[0-9]+]] = mul i64 %polly.indvar{{.*}}, 30         
; CHECK:          [[GEP:%scevgep[0-9]+]] = getelementptr i32, i32* %scevgep{{.*}}, i64 [[REGISTER]]
; CHECK:          store i32 0, i32* [[GEP]]                                      
                                                                                 
; CHECK-LABEL:  polly.stmt.if.else:                                              
; CHECK:          [[REGISTER:%[0-9]+]] = mul i64 %polly.indvar{{.*}}, 30         
; CHECK:          [[GEP:%scevgep[0-9]+]] = getelementptr i32, i32* %scevge{{.*}}, i64 [[REGISTER:%[0-9]+]]
; CHECK:          store i32 21, i32* [[GEP]]

Also, I am not fully sure what you check here. The stores you are CHECKing for seem to be due to array stores. Stmt_for_body_101TOfor_inc_114 does not have any scalar stores modeled in -polly-scops. In your comments you mention dominance, so maybe this is related to the condition "!DT.dominates(cast<Instruction>(Val)->getParent(), StmtExitBB))", but as this condition is only checked for scalar writes this does not seem to be the case. I also tried if your patch affects this test, but in my experiments it passes with and without the remaining patch applied. Could you help me to understand how this test case is affected by your change?

[1] http://llvm.org/docs/CommandGuide/FileCheck.html#filecheck-variables

Meinersbur added inline comments.Oct 13 2015, 10:58 AM
lib/CodeGen/BlockGenerators.cpp
1150 ↗(On Diff #37133)

Considering that I encountered intra-scopstmt writes I assumed that there will also be intra-scop reads. If this assumption is wrong we could indeed fall through BlockGenerator::generateScalarLoads directly.

1210 ↗(On Diff #37133)

There can be (unecessary) writes to a.s2a and b.s2a which are not necessarily in the exit(ing) block, i.e. somewhere in the middle. This violated the intention I had to not write to (not yet existing) MAPPED scalars in the middle of ScopStmt, making intra-ScopStmt lifetime analysis necessary.

Stores in the BBs appear in the reverse order. Not a problem from my POV though.

I don't know whether it is possible that there already is code in the ExitingBBCopy when it has been split up to insert the non-affine region. If yes, building code at the terminator might be after the instruction that have been before.

I wouldn't think too much about this code, I will make it go away in the patch I am working on at the moment.

test/Isl/CodeGen/non-affine-phi-node-expansion-3.ll
20 ↗(On Diff #37133)

I have troubles following this reasoning. If this code was correct before, why would it now be incorrect? Would it break the no-scalar-in-statement dependency property you want to establish?

Because this patch moves the store from the end of the BB where it has been defined to the region exit, where the value's definition is not necessarily dominating.

Taking your example:

               bb1:
               br non-affine bb2, bb3

bb2:                                       bb3:
  a =                                      b =
  <store %a, %a.s2a>
  br bb4                                  br exit

bb4:
  br exit:

exit:
  val = phi [a, %bb4], [b, %bb3]
  <store %a, %a.s2a>

The store can be in bb2, but not in exit. It cannot be in bb2 because of my intention to not have any writes in the middle of ScopStmts. It doesn't need to be written anyway because there can be no use of %a (i.e. without PHIs) beyond bb2/4.

Which check are you talking about? I could not find a check in Polly's source code that was dropped.

; CHECK-NEXT: store float %p_val1, float* %val1.s2a
; CHECK-NEXT: store float %p_val2, float* %val2.s2a

Independent of the comments above, your test seems to be unnecessarily weak. AKA, it would also pass if the original output is generated. I think if the new output is indeed preferable, we should check that it is really generated.

What test are you talking about?

This is not meant as a unit test of any new functionality, just an adaption to changed behavior.

test/Isl/CodeGen/scev_expansion_in_nonaffine.ll
18 ↗(On Diff #37133)

It is intended to fail before the diff update where instead of RegionMaps, a single ScalarMap was used. Johannes and me thought it would be okay, but this one test was failing. I made it to ensure that when trying to remove RegionMaps again, we have a unit test fail before an LNT test fail.

You could argue that this check is unrelated. I might commit it separately.

Hi Michael,

thank you for this quick and detailed answer. I added my comments inline. One general question I have if the original reason for this patch was to somehow handle the inter-non-affine-scop statement scalar dependences visible e.g. in non-affine-phi-node-expansion-4.ll? Meaning: if we do not generate them (as they are clearly not needed), would this possibly even remove the need for the entire patch? Or is the patch needed even if we drop them?

Best,
Tobias

lib/CodeGen/BlockGenerators.cpp
1150 ↗(On Diff #37133)

I might be wrong (even though I do not yet understand why). So let's try to get it right.

What do you mean by intra-scopstmt writes? Are these scalar write statements that are read in the same scopstmt? Did they appear only in RegionStmts and with PHI nodes? I am looking for an example that makes me understand what you mean. Does this pattern appear in one of the test cases that changed? Or are you talking about the in necessary read-write chains we see in non-affine-phi-node-expansion-4.ll?

What would be a intra-scopstmt read? 'intra' suggests at least two elements that are somehow related. So maybe two scalar reads from the same address? Such may indeed happen. The only thing that will not happen is a PHI-read of a PHI-node that is part of the same non-affine region statement.

1210 ↗(On Diff #37133)

There can be (unecessary) writes to a.s2a and b.s2a which are not necessarily in the exit(ing) block, i.e. somewhere in the middle. This
violated the intention I had to not write to (not yet existing) MAPPED scalars in the middle of ScopStmt, making intra-ScopStmt lifetime
analysis necessary.

Probably this was already explained in earlier discussions, but I do not yet fully understand what "MAPPED scalars" will be (probably the once that will be folded into array loads). To get a better feeling let me try to understand the intention of your changes. As far as I understand not inserting a.s2a and b.s2a somewhere in the middle of a non-affine region statement is important, but dropping the .s2a is not, right? Is the following code sufficient for what you need. It should generate all s2a in the exit block, but does still generate all s2a statements that we currently model.

for (MemoryAccess *MA : Stmt) {                                                 
  if (MA->isExplicit() || MA->isRead())                                         
    continue;                                                                   
                                                                                
  Instruction *ScalarInst = MA->getAccessInstruction();                         
  Value *Val = MA->getAccessValue();                                            
                                                                                
  // In case we add the store into an exiting block, we need to restore the     
  // position for stores in the exit node.                                      
  auto SavedInsertionPoint = Builder.GetInsertPoint();                          

  if (MA->isPHI()) {
    BasicBlock *ExitingBB = ScalarInst->getParent();                              
    BasicBlock *ExitingBBCopy = BlockMap[ExitingBB];                              
    Builder.SetInsertPoint(ExitingBBCopy->getTerminator());                       
  }
                                                                                
  auto Address = getOrCreateAlloca(*MA);                                        
                                                                                
  Val = getNewScalarValue(Val, R, Stmt, LTS, BBMap);                            
  Builder.CreateStore(Val, Address);                                            
                                                                                
  // Restore the insertion point if necessary.
  if (MA->isPHI() {
    Builder.SetInsertPoint(SavedInsertionPoint);
  }
}

(I mostly try to understand what are the requirements for your later patches)

I don't know whether it is possible that there already is code in the ExitingBBCopy when it has been split up to insert the non-affine region.
If yes, building code at the terminator might be after the instruction that have been before.\

Good point, but there should be no such code (besides branch statements and other control-flow conditions that would be unrelated).

I wouldn't think too much about this code, I will make it go away in the patch I am working on at the moment.

OK. Though I still try to be a little detailed to make sure I get a better feeling of the direction you are aiming. I was a little high-level last week and I think that may have caused some confusion. If I know where you are going, it is easier for me to get the upcoming patches.

test/Isl/CodeGen/non-affine-phi-node-expansion-3.ll
20 ↗(On Diff #37133)

I have troubles following this reasoning. If this code was correct before, why would it now be incorrect? Would it break the no-scalar-
in-statement dependency property you want to establish?

Because this patch moves the store from the end of the BB where it has been defined to the region exit, where the value's definition is not
necessarily dominating.

Taking your example:

              bb1:
             br non-affine bb2, bb3

bb2:                                       bb3:
 a =                                      b =
 <store %a, %a.s2a>
 br bb4                                  br exit

bb4:
  br exit:

exit:
 val = phi [a, %bb4], [b, %bb3]
 <store %a, %a.s2a>

The store can be in bb2, but not in exit. It cannot be in bb2 because of my intention to not have any writes in the middle of ScopStmts. It
doesn't need to be written anyway because there can be no use of %a (i.e. without PHIs) beyond bb2/4.

As far as I understand the test case that triggered this discussion is not affected by this issue. I just committed test/Isl/CodeGen/non-affine-phi-node-expansion-4.ll where the value passed to the PHI node indeed does not dominate the non-affine region's exit. Looking at the code your unmodified patch creates, I see the following suspicious statements in the generated code that were not created before your patch:

polly.stmt.loop.entry:                            ; preds = %polly.loop_header
  %val1.s2a.reload = load float, float* %val1.s2a
  %val2.s2a.reload = load float, float* %val2.s2a
  br label %polly.stmt.loop


polly.stmt.branch2:                               ; preds = %polly.stmt.branch1
  store float %val2.s2a.reload, float* %merge.phiops
  br label %polly.stmt.backedge.exit

Is this a bug in your patch?

Now as I understand the issue, it seems what we want to do is to ensure we do not create scalar loads in case the (PHI-node) use is within the
same region. I think this is indeed something we should avoid, but as Johannes pointed out we probably want to fix this already at the place where we model the scop statement. Specifically, the following code should be modified to take non-affine regions into account:

if (OpI) {                                                                   
  BasicBlock *OpIBB = OpI->getParent();                                      
  // As we pretend there is a use (or more precise a write) of OpI in OpBB   
  // we have to insert a scalar dependence from the definition of OpI to     
  // OpBB if the definition is not in OpBB.                                  
  if (OpIBB != OpBB) {                                                       
    addScalarReadAccess(OpI, PHI, OpBB);                                     
    addScalarWriteAccess(OpI);                                               
  }

The above seems to be an independent change that should go in before this patch.

Independent of the comments above, your test seems to be unnecessarily weak. AKA, it
would also pass if the original output is generated. I think if the new output is indeed preferable,
we should check that it is really generated.

What test are you talking about?
This is not meant as a unit test of any new functionality, just an adaption to changed behavior.

This test case. It is is important that the .s2a accesses are removed, the test should fail if this is not the case. I would just make this a CHECK-NEXT and also check for the (branch?) instruction that follows the .phiops store. (I committed this hardening already when adding non-affine-phi-node-expansion-4.ll)

test/Isl/CodeGen/scev_expansion_in_nonaffine.ll
18 ↗(On Diff #37133)

OK. I see. Good that you added the test case. It would indeed be better to commit it ahead of time to make clear that this test has been working before and still keeps working. Feel free to commit as is without further review (but please include the regex stuff).

Meinersbur updated this revision to Diff 37490.Oct 15 2015, 9:01 AM

Rebase to r250411

Dominance check still needed or phi_in_exit_early_lnt_failure_2.ll fails.

I could further investigate why why condition in generateScalarStores cannot be further simplified, but I don't think its worth bothering. D13762 should clean it all up.

lib/CodeGen/BlockGenerators.cpp
1170 ↗(On Diff #37490)

Your proposed code fails Isl/CodeGen/phi_in_exit_early_lnt_failure_2.ll with assertion:

Assertion `!verifyGeneratedFunction(S, *EnteringBB->getParent()) && "Verification of generated function failed"' failed.
test/Isl/CodeGen/non-affine-phi-node-expansion-3.ll
20–21 ↗(On Diff #37490)
%val1.s2a.reload = load float, float* %val1.s2a
%val2.s2a.reload = load float, float* %val2.s2a

appeared because I modified the check in generateScalarLoads that originally was `isa<PHINode>(Inst)`. The new condition also let loads through that are also not supposed to be there in the first place. I reinstated the original condition although I am not convinced it is correct There could be a legitimate PHI in the subregion's entry block.

Just to remind, this is going to be cleaned up in D13762 anyway.

Hi Michael,

it seems after r250411 the parts of the patch that caused concerns are not needed any more. Even though they might be replaced soon, I think it is worth committing a minimal patch. I am very bad in keep track of things that should be removed, so while at this lets try to avoid adding unnecessary complicated code (including the PHI node stuff that does not seem to be well understood). Maybe you could rebase the patch once again?

Best,
Tobias

lib/CodeGen/BlockGenerators.cpp
1170 ↗(On Diff #37490)

I just retested after 250411 and I do not see any failure.

test/Isl/CodeGen/non-affine-phi-node-expansion-3.ll
20–21 ↗(On Diff #37490)

After 250411 I do not see such loads being generated even with the just this simple implementation:

void RegionGenerator::generateScalarLoads(ScopStmt &Stmt, ValueMapT &BBMap) {
  return BlockGenerator::generateScalarLoads(Stmt, BBMap);
}
Meinersbur updated this revision to Diff 37533.Oct 15 2015, 3:50 PM

Rebase to r250439; removed now unecessary override of generateScalarLoads.

Meinersbur added inline comments.Oct 15 2015, 3:54 PM
lib/CodeGen/BlockGenerators.cpp
1151 ↗(On Diff #37533)

I still see the failure. I will look tomorrow into it.

test/Isl/CodeGen/non-affine-phi-node-expansion-3.ll
19–23 ↗(On Diff #37533)

Mmmh, I looked for fixing the suspicious instructions in non-affine-phi-node-expansion-4.ll first without noticing they are gone with 250411 as well.

Meinersbur updated this revision to Diff 37595.Oct 16 2015, 8:51 AM

Rebase to r250518; Remove now unnecessary code to detect useless MemoeryAccesses.

Hi Michael,

besides one minor comment this patch looks now good to me.

Best,
Tobias

lib/CodeGen/BlockGenerators.cpp
1114 ↗(On Diff #37595)

Was there a specific reason you moved away from the MA->isPHI()? Your new code seems to be functional equivalent, as TerminatorInstructions are only used as memory access instructions for PHI nodes, but I find this less clear than MA->isPHI().

grosser accepted this revision.Oct 16 2015, 3:24 PM
grosser edited edge metadata.

Forgot to accept the revision.

This revision is now accepted and ready to land.Oct 16 2015, 3:24 PM
Meinersbur added inline comments.Oct 17 2015, 8:18 AM
lib/CodeGen/BlockGenerators.cpp
1114 ↗(On Diff #37595)

Exit PHI node handling creates SCALAR access for incoming values, to handle them as escaping values. Effectively, it means that the created alloca has the suffix .phiops instead of .s2a.

Also see ScopInfo::addPHIWriteAccess()

Meinersbur added inline comments.

Comment at: lib/CodeGen/BlockGenerators.cpp:1114
@@ +1113,3 @@
+ // Implicit writes induced by PHIs must be written in the incoming blocks.
+ if (isa<TerminatorInst>(ScalarInst)) {

+ BasicBlock *ExitingBB = ScalarInst->getParent();

grosser wrote:

Was there a specific reason you moved away from the MA->isPHI()? Your new code seems to be functional equivalent, as TerminatorInstructions are only used as memory access instructions for PHI nodes, but I find this less clear than MA->isPHI().

Exit PHI node handling creates SCALAR access for incoming values, to handle them as escaping values. Effectively, it means that the created alloca has the suffix .phiops instead of .s2a.

Also see ScopInfo::addPHIWriteAccess()

OK. Maybe add a test case that would have failed with the code I proposed.

Best,
Tobias

Meinersbur edited edge metadata.

Rebase to D13848

OK. Maybe add a test case that would have failed with the code I proposed.

Isl/CodeGen/phi_in_exit_early_lnt_failure_2.ll already fails with that code (just tried it). Do you want a dedicated test case?

Meinersbur added a comment.

In http://reviews.llvm.org/D13487#269745, @grosser wrote:

OK. Maybe add a test case that would have failed with the code I proposed.

Isl/CodeGen/phi_in_exit_early_lnt_failure_2.ll already fails with that code (just tried it). Do you want a dedicated test case?

No, I just did not realize it would fail. (It did not in my earlier
experiments, but does now). Then its fine as it is.

Best,
Tobias

This revision was automatically updated to reflect the committed changes.

Hi Michael,

the latest version of this patch does not automatically apply as it has 'polly/trunk/' in the prefix. Could you possibly fix this?

It seems I got confused. This patch was already committed, it was still kept as a dependent patch in phabricator. When trying to apply the patch that was registered as depending on this one, 'arc patch' tried to reapply the patch that was already committed and failed. I resolved this by removing the dependency to the already committed patch.

OK, thanks. Because I do have this patch in my own repository, I never used arc patch by myself.