This is an archive of the discontinued LLVM Phabricator instance.

[Attributor] Heap-To-Stack Conversion
ClosedPublic

Authored by sstefan1 on Jul 29 2019, 10:40 AM.

Details

Summary

D53362 gives a prototype heap-to-stack conversion pass. With addition of new attributes in the attributor, this can now be revisted and improved. This will place it in the Attributor to make it easier to use new attributes (eg. nofree, nosync, willreturn, etc.) and other attributor features.

I'm starting with some test cases (more will be added along the way). For now they are in FunctionAttrs, as most other attributor tests, but that can be changed.

Event Timeline

sstefan1 created this revision.Jul 29 2019, 10:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2019, 10:40 AM
sstefan1 updated this revision to Diff 212221.Jul 29 2019, 1:57 PM
chenge function names

Negative tests:
Malloc in loop body
Irreducible cfg

?

Top-level functions declarations should have attributes to make them as restricted as needed, e.g. sync_func should probably be nofree or renamed to sync_free_func.

llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
49

What is the difference between Test 2 and 3?

sstefan1 marked an inline comment as done.Jul 30 2019, 4:32 AM

Top-level functions declarations should have attributes to make them as restricted as needed, e.g. sync_func should probably be nofree or renamed to sync_free_func.

Yes. I'll refine them along the way.

Negative tests:
Malloc in loop body
Irreducible cfg

?

Thanks for suggestions. These will also be added. I will probably put some code today, so I can actually test this on something.

llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
49

Nothing :). I meant to put call to nosync function here. I'll correct this.

sstefan1 updated this revision to Diff 215801.Aug 18 2019, 2:49 PM

Providing an implementation for the conversion. This is few commits behind fromt the latest Attributor, will be rebased.

sstefan1 updated this revision to Diff 215803.Aug 18 2019, 2:51 PM
  • minor fix
jdoerfert added inline comments.Aug 19 2019, 11:18 AM
llvm/lib/Transforms/IPO/Attributor.cpp
2819

Not needed, see my comment further below.

2824

Not needed, we can use the InfoCache to look only at calls.

2830

Let's collect all valid h2s candidate malloc calls and then only "materialize" the ones we deem profitable. Knowing a malloc could be converted to the stack is already a nice piece information.

3047

If you want to start with a small patch, just check the function for willreturn, nounwind, and nofree.

3074

I dislike this exploration. We should not need to traverse all paths all the time. (This does not track visited blocks btw. so it is probably an endless loop if there are loops in the function).

We should (later) follow the uses of the malloc only. The logic I have in mind is as follows (IMHO much easier than this (=the old logic)):

If all (transitive) users of a malloc are either nocapture uses or free uses, we can do heap2stack. In fact, we should probably generalize the capture AA to collect capturing uses instead so we can analyze them later, e.g., here.

llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
18

I think this is a bug, see below.

I'll create a patch to *not* annotate free with nocapture.

26

I'd argue, "freeing" a pointer is "capturing" by nature so this combination "nocapture" + "free" cannot exist. If it doesn't, "nocapture" is marked and implies "nofree" on the pointer. Thus, we can actually allocate %1 on the stack here.

sstefan1 updated this revision to Diff 216967.Aug 23 2019, 3:20 PM

Taking a different approach using nocapture and nofree.

sstefan1 updated this revision to Diff 216968.Aug 23 2019, 3:22 PM
remove llvm_debug
Harbormaster completed remote builds in B37224: Diff 216968.

We need tests where stuff is read/written as well.

llvm/lib/Transforms/IPO/Attributor.cpp
2719

Mallocs can become live later so we shouldn't remove them but only ignore them for the moment. In fact, the checkForAllCallLikeInstructions call will already filter "currently assumed dead" ones out so you have to invoke it in every update or collect all calls through the InfoCache explicitly in the initialize method.

2724

This is quadratic in the number of malloc calls.

If you want to exclude malloc calls from the tests below do:
if (MallocCalls.count(&I)) return true;

2729

No need to check, just put it in FreesForMalloc (though I'm unclear why we need it at the moment)

2773

I was hoping we follow and check uses:

UsesCheck = (Instruction &I) {
  SmallPtrSet<const Use *, 8> Visisted;
  SmallVector<const Use *, 8> Worklist;
  Worklist.append(I.use_begin(), I.use_end());
  while (!Worklist.empty()) {
    const Use *U = Worklist.pop_back_val();
    if (!Visited.insert(U).second) continue;
    const auto *UserI = U->getUser();
    if (isa<LoadInst>(UserI) || isa<StoreInst>(UserI))
     continue;
    if (auto *CB = dyn_cast<CallBase>(UserI)) {
      // TODO: Check use once we have nofree on call site args.
      const auto &NoFreeAA = A.getAAFor<AANoFree>(..., IRPosition::call_site(CB));
      if (!NoFreeAA.isAssumedNoFree()) return false;
      // same for nocapture
      // if both are good, continue.
    }
    if (isa<GetElementOperator>(UserI) || isa<BitCastInst>(UserI)) {
      // Add users to worklist and continue
    }
    // unknown user
    return false;
  }
  return true;
}

MalloCheck = (Instruction &I) {
  if (!isaMalloc(I)) return true;
  if (BadMallocs.count(&I)) return true;
  // assumed h2s malloc I
  if (UsesCheck(I))
    Mallocs.insert(&I);
  else
    BadMallocs.insert(&I);
  return true;
}

checkForAllCallLikeInstructions(MallocCheck)
2787

Is this "Number of mallocs converted to allocs"?

2792

I doubt it makes sense to have a H2S call site AA. We should probably only allow the function one.

3260

Still use the Whitelist and group it with the other function AAs please

llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
41

We need a test, sync_will_return before free which should (later) be OK.

121

This should be h2s as the free is dead and there are no users that are nofree.

sstefan1 updated this revision to Diff 217286.Aug 26 2019, 6:02 PM

addressing comments

sstefan1 updated this revision to Diff 217287.Aug 26 2019, 6:07 PM
  • remove mistake
sstefan1 marked an inline comment as done.Aug 26 2019, 6:20 PM
sstefan1 added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
2729

The reason for this is if for example a malloc becomes 'bad' then all the frees that use it can be easily disregarded.

sstefan1 updated this revision to Diff 217518.Aug 27 2019, 3:52 PM
  • Small corrections.

There are some leftover comments but this is almost ready.

llvm/lib/Transforms/IPO/Attributor.cpp
129

I'd enable this by default but for a smaller size, let's say 128?

2652

AI = new BitCastInst(AI, MallocCall->getType(), "malloc_bc", AI->getNextNode());

might do the trick.

2673

remove the const here then you can avoid the const casts

2719

If no-free is set nocaputre is not checked anymore. We need to check for nofree and nocapture.

2724

If this is not the arg operand we should give up. (we may allow it at the callee position but not in the operand bundle)

llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
59

outdated comment.

69

maybe elaborate why h2s is still OK.

104

You want branch %6 or you would have a double free (not necessarily a problem for h2s but still).

122

leftover comment.

128

We need this test again with an nounwind @foo which should allow h2s.

Negative tests:
Malloc in loop body
Irreducible cfg

?

Still missing.

+ support for calloc shouldn’t be very hard to add.

There are some leftover comments but this is almost ready.

I'd like to see tests with https://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic

@xbolva00 I'm sorry this has dragged on a bit.

I will address all the comments, probably tomorrow.

sstefan1 updated this revision to Diff 219196.Sep 6 2019, 4:16 PM

addressing comments

sstefan1 updated this revision to Diff 219202.Sep 6 2019, 4:43 PM

small fix

Is this in a state to be reviewed or do you plan to go over it again?

llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
252

check lines missing.

jdoerfert accepted this revision.Sep 12 2019, 5:47 PM

Can you rebase and remove the artifacts in the diff? I also provided final comments. I think once they are fixed it is fine. Feel free to commit a fixed version.

llvm/lib/Transforms/IPO/Attributor.cpp
2664

Leftover comment and braces around single stmt.

2745

this has to lead to invalidation. use in a operand bundle could free even if the function is nofree. Maybe move the check to the beginning of the CallBase conditional.

2765

Allow lifetime.start/end as users here as well. That should unbreak one of the test cases.unbreak

2780

You have to put them into the "Bad" instruction set here. Also, move the "is bad" check to the beginning.
There is a test missing for a malloc and a calloc that have to large sizes. (calloc only the multiplication should be too large).
We need to check that the multiplication does not overflow. APInt has a method to do overflow checked multiplication.

llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
2

When we run this with the old PM, what will happen? No h2s? We should do it and make sure it doesn't crash etc.

203

This should be fine. Also a lifetime.end user is fine.

This revision is now accepted and ready to land.Sep 12 2019, 5:47 PM
sstefan1 marked 3 inline comments as done.Sep 14 2019, 2:21 PM
sstefan1 added inline comments.
llvm/lib/Transforms/IPO/Attributor.cpp
2745

It makes sense to me to move the check to the beginning.

use in a operand bundle could free even if the function is nofree

I'm not I understand this, though.

2780

Also, move the "is bad" check to the beginning.

If you mean in initialize, then I'd say this is a nicer approach.

llvm/test/Transforms/FunctionAttrs/heap_to_stack.ll
2

Yes, no h2s and doesn't crash.

This revision was automatically updated to reflect the committed changes.

I would prefer to use update_test_checks.py instead of hand written checks.

Quite questionable is “stackification” of mallocs in loops, we should avoid it I think. @efriedma, what do you think?

xbolva00 added a comment.EditedSep 15 2019, 3:10 PM

IsMallocLikeFn returns true for C++ operator new which does throw on an allocation failure.

So probably you stackify:

int *p = new int;
free(p);

It is a stupid code and buggy, but please add it as test case and the following fix:

For now, just check if (isMallocLike && !isOpNewLike).

I think it is worth to add support for c++’s new - delete pairs in the future..

Quite questionable is “stackification” of mallocs in loops, we should avoid it I think. @efriedma, what do you think?

Please can you be more specific?

IsMallocLikeFn returns true for C++ operator new which does throw on an allocation failure.

Likewise, please can you be more specific in your messages? This doesn't actually explain what the problem is.

xbolva00 added inline comments.Sep 15 2019, 3:17 PM
llvm/trunk/lib/Transforms/IPO/Attributor.cpp
3229 ↗(On Diff #220255)

Check for overflow is missing

3263 ↗(On Diff #220255)

Builder.CreateMemset

I think the current implementation needs some important fixes to avoid miscompilation/overflow, can you make it off by default, fix issues, and enable it again?

Thanks.

xbolva00 added a comment.EditedSep 15 2019, 3:41 PM

Quite questionable is “stackification” of mallocs in loops, we should avoid it I think. @efriedma, what do you think?

Please can you be more specific?

Yes :)
Heap to stack idea started some time ago and @efriedma raised some concerns: https://reviews.llvm.org/D47345#1112573.
Imagine

for (int i = 0; i < BIG_NUM; ++i) {

void *p = malloc(128);
....
free(p);

}

->

for (int i = 0; i < BIG_NUM; ++i) {

void *p = alloca(128);
....

}

Nice stack overflow :)

And yes, we shouldn't do this on irreducible cfgs (also noted by Eli).

IsMallocLikeFn returns true for C++ operator new which does throw on an allocation failure.

Likewise, please can you be more specific in your messages? This doesn't actually explain what the problem is.

%1 = tail call dereferenceable(4) i8* @_Znwm(i64 4)
...some code...
tail call void @free(i8* nonnull %1)

is probably converted to alloca. I have no strong opinion what should be done here, ideally warning! :) Anyway, I would like to see some tests with C++'s new and various combinations (there are many variants of 'new', 'delete').

sstefan1 marked an inline comment as done.Sep 15 2019, 11:07 PM

I think the current implementation needs some important fixes to avoid miscompilation/overflow, can you make it off by default, fix issues, and enable it again?

Thanks.

Attributor is turned if by dedault right now, so h2s is also turned off. I will revisit this again this week and try to address most of the concerns you raised in a new patch.

llvm/trunk/lib/Transforms/IPO/Attributor.cpp
3229 ↗(On Diff #220255)

If there is an overflow, this will not be executed.

xbolva00 added inline comments.Sep 15 2019, 11:12 PM
llvm/trunk/lib/Transforms/IPO/Attributor.cpp
3229 ↗(On Diff #220255)

Ok, great

sstefan1 marked an inline comment as done.Sep 15 2019, 11:14 PM
sstefan1 added inline comments.
llvm/trunk/lib/Transforms/IPO/Attributor.cpp
3363 ↗(On Diff #220255)

This was supposed to be if(Overflow). I'll fix this.

I think the current implementation needs some important fixes to avoid miscompilation/overflow, can you make it off by default, fix issues, and enable it again?

The Attributor is off by default (even though we should now pass LLVM-TS + SPEC2006 tests!!).

IsMallocLikeFn returns true for C++ operator new which does throw on an allocation failure.

[...] is probably converted to alloca. I have no strong opinion what should be done here, ideally warning! :) Anyway, I would like to see some tests with C++'s new and various combinations (there are many variants of 'new', 'delete').

I fail to see the problem in converting a possible throwing new to a non-throwing alloca.
If you think there is one could you please try to explain why that should not be valid?

I think it is worth to add support for c++’s new - delete pairs in the future..

We can add this (the delete part) when we actually use free/delete to reason about h2s validity.

Quite questionable is “stackification” of mallocs in loops, we should avoid it I think. @efriedma, what do you think?

Heap to stack idea started some time ago and @efriedma raised some concerns: https://reviews.llvm.org/D47345#1112573.

From a pure standard point of view, I would argue the current implementation is not wrong in doing h2s for mallocs in loops.
I do however not intent to turn it on and let people deal with the aftermath. As mentioned by @sstefan1, this is not on by default.
My plan was to get the logic in, add the "must-be-freed"-based logic that @hfinkel used in the original version, and then look into heuristics.
The reason is that the "must-be-freed" actually allows different placement of the alloca and we want to see what is out there wrt. sizes etc.
Does that make sense and does my plan sound OK to you?

llvm/trunk/lib/Transforms/IPO/Attributor.cpp
3363 ↗(On Diff #220255)

Add a test for this please.

If you think there is one could you please try to explain why that should not be valid?

Ok, now I think it should be okay.

Does that make sense and does my plan sound OK to you?

Ok.