This is an archive of the discontinued LLVM Phabricator instance.

Fix bug in llvm::DemoteRegToStack
Needs ReviewPublic

Authored by ahatanak on Dec 18 2014, 1:47 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

In the following example, when DemoteRegToStack demotes %call.i.i.i14.i.i, it firsts scans all the uses of %call.i.i.i14.i.i and inserts load instructions before all its users. Since %0 is a phi instruction, a load instruction is inserted in the predecessor block, which is the entry basic block. Then it inserts a store instruction to save the value it is demoting to the stack. In this case, %call.i.i.i14.i.i is an invoke, so it creates a basic block that splits the critical edge (entry-do.body.i.i.i), and inserts the store there. This is incorrect because the store instruction should be executed before the load instruction.

define void @_Z4foo1c(i8 signext %a) {
entry:

...
%call.i.i.i14.i.i = invoke noalias i8* @_Znwm(i32 1024)
        to label %do.body.i.i.i unwind label %lpad.body

do.body.i.i.i: ; preds = %entry, %_ZNSt3__116allocator_traitsINS_9allocatorIcEEE9constructIccEEvRS2_PT_RKT0_.exit.i.i.i

%0 = phi i8* [ %incdec.ptr.i.i.i, %_ZNSt3__116allocator_traitsINS_9allocatorIcEEE9constructIccEEvRS2_PT_RKT0_.exit.i.i.i ], [ %call.i.i.i14.i.i, %entry ]
...

The IR after SjLjEHPrepare pass looks like this:

define void @_Z4foo1c(i8 signext %a) {
entry:

%call.i.i.i14.i.i.reg2mem = alloca i8*
 ...
%call.i.i.i14.i.i.reload = load volatile i8** %call.i.i.i14.i.i.reg2mem
...
%call.i.i.i14.i.i = invoke noalias i8* @_Znwm(i32 1024)
        to label %entry.do.body.i.i.i_crit_edge unwind label %lpad.body

entry.do.body.i.i.i_crit_edge: ; preds = %entry

store i8* %call.i.i.i14.i.i, i8** %call.i.i.i14.i.i.reg2mem
br label %do.body.i.i.i

do.body.i.i.i: ; preds = %entry.do.body.i.i.i_crit_edge, %_ZNSt3__116allocator_traitsINS_9allocatorIcEEE9constructIccEEvRS2_PT_RKT0_.exit.i.i.i

%3 = phi i8* [ %incdec.ptr.i.i.i, %_ZNSt3__116allocator_traitsINS_9allocatorIcEEE9constructIccEEvRS2_PT_RKT0_.exit.i.i.i ], [ %call.i.i.i14.i.i.reload, %entry.do.body.i.i.i_crit_edge ]

This patch fixes the bug by changing llvm::DemoteRegToStack to insert the store instruction and create basic blocks that split critical edges first, and then insert the instructions to reload the value.

Diff Detail

Event Timeline

ahatanak updated this revision to Diff 17467.Dec 18 2014, 1:47 PM
ahatanak retitled this revision from to Fix bug in llvm::DemoteRegToStack.
ahatanak updated this object.
ahatanak edited the test plan for this revision. (Show Details)
ahatanak added a subscriber: Unknown Object (MLST).
  • lib/Transforms/Utils/DemoteRegToStack.cpp

+++ lib/Transforms/Utils/DemoteRegToStack.cpp
@@ -21,6 +21,7 @@

/// alloca.  This allows the CFG to be changed around without fear of
/// invalidating the SSA information for the value.  It returns the

pointer to

/// the alloca inserted to create a stack slot for I.

+

AllocaInst *llvm::DemoteRegToStack(Instruction &I, bool VolatileLoads,
                                   Instruction *AllocaPoint) {
  if (I.use_empty()) {

Remove the extra blank line, this comment is attached to this function.

+ StoreInst *Store = new StoreInst(&I, Slot, InsertPt);
+
+ Change all of the users of the instruction, except the Store we've
just
+
inserted, to read from the stack slot.
+ SmallVector<User*, 4> UsersOfI(I.users());
+
+ for (User *U : UsersOfI) {
+ if (PHINode *PN = dyn_cast<PHINode>(U)) {

Can you avoid the extra copy? What if you create the store now, but
don't insert it yet? (I think InsertPt can't be one of the users, correct?)

While we're here:

for (; isa<PHINode>(InsertPt) || isa<LandingPadInst>(InsertPt);

++InsertPt)

/* empty */;   // Don't insert before PHI nodes or landingpad instrs.

This is redundant with getFirstInsertionPt. Can this be hoisted up to
the only place that sets InsertPt without using getFirstInsertionPt?

Nick

Akira Hatanaka wrote:

In the following example, when DemoteRegToStack demotes %call.i.i.i14.i.i, it firsts scans all the uses of %call.i.i.i14.i.i and inserts load instructions before all its users. Since %0 is a phi instruction, a load instruction is inserted in the predecessor block, which is the entry basic block. Then it inserts a store instruction to save the value it is demoting to the stack. In this case, %call.i.i.i14.i.i is an invoke, so it creates a basic block that splits the critical edge (entry-do.body.i.i.i), and inserts the store there. This is incorrect because the store instruction should be executed before the load instruction.

define void @_Z4foo1c(i8 signext %a) {
entry:

...
%call.i.i.i14.i.i = invoke noalias i8* @_Znwm(i32 1024)
        to label %do.body.i.i.i unwind label %lpad.body

do.body.i.i.i: ; preds = %entry, %_ZNSt3__116allocator_traitsINS_9allocatorIcEEE9constructIccEEvRS2_PT_RKT0_.exit.i.i.i

%0 = phi i8* [ %incdec.ptr.i.i.i, %_ZNSt3__116allocator_traitsINS_9allocatorIcEEE9constructIccEEvRS2_PT_RKT0_.exit.i.i.i ], [ %call.i.i.i14.i.i, %entry ]
...

The IR after SjLjEHPrepare pass looks like this:

define void @_Z4foo1c(i8 signext %a) {
entry:

%call.i.i.i14.i.i.reg2mem = alloca i8*
 ...
%call.i.i.i14.i.i.reload = load volatile i8** %call.i.i.i14.i.i.reg2mem
...
%call.i.i.i14.i.i = invoke noalias i8* @_Znwm(i32 1024)
        to label %entry.do.body.i.i.i_crit_edge unwind label %lpad.body

entry.do.body.i.i.i_crit_edge: ; preds = %entry

store i8* %call.i.i.i14.i.i, i8** %call.i.i.i14.i.i.reg2mem
br label %do.body.i.i.i

do.body.i.i.i: ; preds = %entry.do.body.i.i.i_crit_edge, %_ZNSt3__116allocator_traitsINS_9allocatorIcEEE9constructIccEEvRS2_PT_RKT0_.exit.i.i.i

%3 = phi i8* [ %incdec.ptr.i.i.i, %_ZNSt3__116allocator_traitsINS_9allocatorIcEEE9constructIccEEvRS2_PT_RKT0_.exit.i.i.i ], [ %call.i.i.i14.i.i.reload, %entry.do.body.i.i.i_crit_edge ]

This patch fixes the bug by changing llvm::DemoteRegToStack to insert the store instruction and create basic blocks that split critical edges first, and then insert the instructions to reload the value.

http://reviews.llvm.org/D6729

Files:

lib/Transforms/Utils/DemoteRegToStack.cpp
test/CodeGen/ARM/sjlj-prepare-critical-edge.ll

EMAIL PREFERENCES

http://reviews.llvm.org/settings/panel/emailpreferences/

llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

+ StoreInst *Store = new StoreInst(&I, Slot, InsertPt);
+
+ Change all of the users of the instruction, except the Store we've
just
+
inserted, to read from the stack slot.
+ SmallVector<User*, 4> UsersOfI(I.users());
+
+ for (User *U : UsersOfI) {
+ if (PHINode *PN = dyn_cast<PHINode>(U)) {

Can you avoid the extra copy? What if you create the store now, but
don't insert it yet? (I think InsertPt can't be one of the users, correct?)

I'm probably not understanding your suggestion, but is it possible to create the Store instruction without adding it the UseList of "I"? Also, I think InsertPt can be one of the users if the def (instruction "I") is immediately followed by a user of "I".

If that doesn't work, I was thinking maybe I can just split it into two parts. The first part just splits the critical edge and fixes the CFG, if that's necessary. After that, load instructions are inserted before all the users. The last step computes the insertion point and inserts the store instruction there.

Akira Hatanaka wrote:

+ StoreInst *Store = new StoreInst(&I, Slot, InsertPt);

+
+  // Change all of the users of the instruction, except the Store we've

just

+  // inserted, to read from the stack slot.
+  SmallVector<User*, 4>  UsersOfI(I.users());
+
+  for (User *U : UsersOfI) {
+    if (PHINode *PN = dyn_cast<PHINode>(U)) {

Can you avoid the extra copy? What if you create the store now, but

don't insert it yet? (I think InsertPt can't be one of the users, correct?)

I'm probably not understanding your suggestion, but is it possible to create the Store instruction without adding it the UseList of "I"?

Yes. "StoreInst *SI = net StoreInst(Val, Ptr);" will create it without
inserting it anywhere, then you can insert it later via.
SI->insertBefore(Inst) or insertAfter(Inst).

Also, I think InsertPt can be one of the users if the def (instruction
"I") is immediately followed by a user of "I".

If that doesn't work, I was thinking maybe I can just split it into two parts. The first part just splits the critical edge and fixes the CFG, if that's necessary. After that, load instructions are inserted before all the users. The last step computes the insertion point and inserts the store instruction there.

http://reviews.llvm.org/D6729

EMAIL PREFERENCES

http://reviews.llvm.org/settings/panel/emailpreferences/
ahatanak updated this revision to Diff 18271.Jan 15 2015, 4:44 PM

Patch updated.

I moved the piece of code which splits the critical edge before the while loop which replaces all the user of I. I believe this should work, although this is probably different from the solution you suggested. Please let me know if I completely misunderstood your comments.

I didn't quite understand your comment about creating the store instruction and inserting it later. You can create the store without inserting it into the basic block, but you are still adding the store to the use list of instruction "I", no? If so, the while loop which replaces all the uses of "I" has to replace the value operand of the store (because the loop won't terminate otherwise), which is not what we want to do.

Also, in the case where the use immediately follows the def, I think the insertion point of the store has to be recomputed after the use is replaced and the load is inserted (I'm assuming it's legitimate to use DemoteRegToStack in such case).

  1. before transformation.

v0 = Def
Use v0 <- store's insertPt

  1. after replacing use of v0.

v0 = Def
v1 = load stackslot
Use v1 <- store's insertPt

  1. after inserting store.

v0 = Def
v1 = load stackslot
store v0, stackslot <- this should be inserted before the load
Use v1

ahatanak updated this revision to Diff 19253.Feb 3 2015, 11:38 AM

Added a line to silence possible warning from release builds.

Nick, do you have further comments?

Akira Hatanaka wrote:

Added a line to silence possible warning from release builds.

Nick, do you have further comments?

Nope! LGTM

Thanks, I'll commit this patch shortly.