This is an archive of the discontinued LLVM Phabricator instance.

[InstructionCombining] Replace small heap allocations with local variables
AbandonedPublic

Authored by xbolva00 on May 24 2018, 2:02 PM.

Details

Reviewers
efriedma
Summary

Requires constant size to be allocated

After inlining we may got code like this:

int main(void) {

char *ptr = malloc(80* sizeof(char)); // SIZE

strcpy(ptr, "test");
puts(ptr);

free(ptr);

}

if SIZE < threshold (currently 256)

-> create something like "char ptr[size]"

Fires 30+ in MultiSource benchmark.

Tests would be added later, after an initial discussion.

Diff Detail

Event Timeline

xbolva00 created this revision.May 24 2018, 2:02 PM
xbolva00 edited the summary of this revision. (Show Details)May 24 2018, 2:03 PM
xbolva00 retitled this revision from [InstructionCombing] Replace small allocations with local/global variable to [InstructionCombining] Replace small allocations with local/global variable.
xbolva00 updated this revision to Diff 148493.May 24 2018, 2:46 PM
xbolva00 edited the summary of this revision. (Show Details)
xbolva00 updated this revision to Diff 148622.May 25 2018, 9:43 AM
xbolva00 edited the summary of this revision. (Show Details)
xbolva00 added a reviewer: efriedma.

The biggest problem I see with this is the potential to cause a stack overflow; you've specified a limit of 100 bytes per allocation, but there's no limit to the number of allocations. Not sure what the right solution is; LLVM doesn't really have good heuristics here.

lib/Transforms/InstCombine/InstructionCombining.cpp
2267

Please make this an option, so it can be tweaked for debugging.

2299

This isn't sufficient to guarantee malloc will be called once; getLoopFor won't find irreducible loops.

2314

You probably don't want to insert an alloca in the middle of a function; allocas outside the entry block are treated as dynamic allocations, so they're slower, and won't work the way you want with code that uses llvm.stackrestore.

Maybe introduce local cache to track:
Function --- Number of malloc -> alloca?

Or allow it only once per function? Or again some treshold value.

lib/Transforms/InstCombine/InstructionCombining.cpp
2267

Command line option for opt? Ok.

2314

Ok, I will fix it.

xbolva00 added inline comments.May 25 2018, 11:45 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
2299

How can we fix it and detect them?

I read some comments about irreducible loops in getNearestLoop...

xbolva00 updated this revision to Diff 148642.May 25 2018, 11:48 AM

Addressed review comments

xbolva00 edited the summary of this revision. (Show Details)May 25 2018, 11:48 AM
xbolva00 marked 4 inline comments as done.
efriedma added inline comments.May 25 2018, 11:55 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
2266

You can't use a static variable like this; LLVM can run on multiple threads in a program at the same time.

2299

There's a helper containsIrreducibleCFG.

xbolva00 updated this revision to Diff 148648.May 25 2018, 12:21 PM
  • Detect IrreducibleCFG
xbolva00 added a comment.EditedMay 25 2018, 12:21 PM

Maybe the code is ready?

So I can create some tests.

xbolva00 retitled this revision from [InstructionCombining] Replace small allocations with local/global variable to [InstructionCombining] Replace small heap allocations with local variables.May 25 2018, 12:32 PM
efriedma added inline comments.May 25 2018, 12:47 PM
lib/Transforms/InstCombine/InstCombineInternal.h
254

This still doesn't really do what you want: opt -instcombine -instcombine will ignore the limit (and we effectively do that in a lot of places). Maybe you could solve that by moving the transform into its own pass? Not sure; I'll think about it over the weekend and get back to you.

lib/Transforms/InstCombine/InstructionCombining.cpp
2296

This check isn't sufficient to ensure some other function doesn't call free() on the pointer (actually, this doesn't even handle the case where there are two calls to free() along different codepaths).

xbolva00 added inline comments.May 25 2018, 1:00 PM
lib/Transforms/InstCombine/InstCombineInternal.h
254

Sure, thanks

lib/Transforms/InstCombine/InstructionCombining.cpp
2296

Other function...

void free_ptr(void **p);
void capture(void*p);

void small_alloc_3(void) {

void *s = malloc(10);
capture(s);
free(s);

}

void small_alloc_4(void) {

void *s = malloc(10);
free_ptr(&s);
free(s);

}

These case are not optimized (PointerMayBeCaptured solves it)

2296

But yes, the second case is a problem.. Not sure what to do.

void small_alloc_bad2(int n, int i) {

char *s = malloc(10);
puts(s);
if (n)
        free(s);

if (i)
        free(s);

}

xbolva00 added inline comments.May 25 2018, 1:01 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
2296

Check uses of "s"? If there is only one "free" ?

efriedma added inline comments.May 25 2018, 1:10 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
2296

Your "capture" function captures the pointer, but it's possible to write a function which takes the pointer as a parameter but doesn't capture it. For example, given the function void f(void *p) { free(p); }, LLVM will mark the parameter "p" nocapture.

xbolva00 updated this revision to Diff 148655.May 25 2018, 1:27 PM
  • Check number of frees
xbolva00 added inline comments.May 25 2018, 2:04 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
2296

Yes hmm. Maybe extend CaptureTracker with to handle this with a new parameter bool ArgCaptures?

xbolva00 added inline comments.Jun 2 2018, 11:40 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
2265

Or maybe we can turn larger allocations into global variables also? + allocations with unknown (not constant) size?

2283

this looks bad, any advice how to do it properly?

Maybe just remove all free calls in Malloc's uses?

xbolva00 added inline comments.Jun 2 2018, 11:55 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
2265

Ah, possible data race on that global var..

Friendly ping :)

xbolva00 abandoned this revision.Jun 12 2018, 12:26 PM