This is an archive of the discontinued LLVM Phabricator instance.

[InstructionCombining] Replace small heap allocations with local variables
AbandonedPublic

Authored by xbolva00 on May 24 2018, 2:02 PM.

Details

Reviewers
efriedma
Summary

Requires constant size to be allocated

After inlining we may got code like this:

int main(void) {

char *ptr = malloc(80* sizeof(char)); // SIZE

strcpy(ptr, "test");
puts(ptr);

free(ptr);

}

if SIZE < threshold (currently 256)

-> create something like "char ptr[size]"

Fires 30+ in MultiSource benchmark.

Tests would be added later, after an initial discussion.

Diff Detail

Event Timeline

xbolva00 created this revision.May 24 2018, 2:02 PM
xbolva00 edited the summary of this revision. (Show Details)May 24 2018, 2:03 PM
xbolva00 retitled this revision from [InstructionCombing] Replace small allocations with local/global variable to [InstructionCombining] Replace small allocations with local/global variable.
xbolva00 updated this revision to Diff 148493.May 24 2018, 2:46 PM
xbolva00 edited the summary of this revision. (Show Details)
xbolva00 updated this revision to Diff 148622.May 25 2018, 9:43 AM
xbolva00 edited the summary of this revision. (Show Details)
xbolva00 added a reviewer: efriedma.

The biggest problem I see with this is the potential to cause a stack overflow; you've specified a limit of 100 bytes per allocation, but there's no limit to the number of allocations. Not sure what the right solution is; LLVM doesn't really have good heuristics here.

lib/Transforms/InstCombine/InstructionCombining.cpp
2278

Please make this an option, so it can be tweaked for debugging.

2310

This isn't sufficient to guarantee malloc will be called once; getLoopFor won't find irreducible loops.

2325

You probably don't want to insert an alloca in the middle of a function; allocas outside the entry block are treated as dynamic allocations, so they're slower, and won't work the way you want with code that uses llvm.stackrestore.

Maybe introduce local cache to track:
Function --- Number of malloc -> alloca?

Or allow it only once per function? Or again some treshold value.

lib/Transforms/InstCombine/InstructionCombining.cpp
2278

Command line option for opt? Ok.

2325

Ok, I will fix it.

xbolva00 added inline comments.May 25 2018, 11:45 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
2310

How can we fix it and detect them?

I read some comments about irreducible loops in getNearestLoop...

xbolva00 updated this revision to Diff 148642.May 25 2018, 11:48 AM

Addressed review comments

xbolva00 edited the summary of this revision. (Show Details)May 25 2018, 11:48 AM
xbolva00 marked 4 inline comments as done.
efriedma added inline comments.May 25 2018, 11:55 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
2277

You can't use a static variable like this; LLVM can run on multiple threads in a program at the same time.

2310

There's a helper containsIrreducibleCFG.

xbolva00 updated this revision to Diff 148648.May 25 2018, 12:21 PM
  • Detect IrreducibleCFG
xbolva00 added a comment.EditedMay 25 2018, 12:21 PM

Maybe the code is ready?

So I can create some tests.

xbolva00 retitled this revision from [InstructionCombining] Replace small allocations with local/global variable to [InstructionCombining] Replace small heap allocations with local variables.May 25 2018, 12:32 PM
efriedma added inline comments.May 25 2018, 12:47 PM
lib/Transforms/InstCombine/InstCombineInternal.h
256

This still doesn't really do what you want: opt -instcombine -instcombine will ignore the limit (and we effectively do that in a lot of places). Maybe you could solve that by moving the transform into its own pass? Not sure; I'll think about it over the weekend and get back to you.

lib/Transforms/InstCombine/InstructionCombining.cpp
2307

This check isn't sufficient to ensure some other function doesn't call free() on the pointer (actually, this doesn't even handle the case where there are two calls to free() along different codepaths).

xbolva00 added inline comments.May 25 2018, 1:00 PM
lib/Transforms/InstCombine/InstCombineInternal.h
256

Sure, thanks

lib/Transforms/InstCombine/InstructionCombining.cpp
2307

Other function...

void free_ptr(void **p);
void capture(void*p);

void small_alloc_3(void) {

void *s = malloc(10);
capture(s);
free(s);

}

void small_alloc_4(void) {

void *s = malloc(10);
free_ptr(&s);
free(s);

}

These case are not optimized (PointerMayBeCaptured solves it)

2307

But yes, the second case is a problem.. Not sure what to do.

void small_alloc_bad2(int n, int i) {

char *s = malloc(10);
puts(s);
if (n)
        free(s);

if (i)
        free(s);

}

xbolva00 added inline comments.May 25 2018, 1:01 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
2307

Check uses of "s"? If there is only one "free" ?

efriedma added inline comments.May 25 2018, 1:10 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
2307

Your "capture" function captures the pointer, but it's possible to write a function which takes the pointer as a parameter but doesn't capture it. For example, given the function void f(void *p) { free(p); }, LLVM will mark the parameter "p" nocapture.

xbolva00 updated this revision to Diff 148655.May 25 2018, 1:27 PM
  • Check number of frees
xbolva00 added inline comments.May 25 2018, 2:04 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
2307

Yes hmm. Maybe extend CaptureTracker with to handle this with a new parameter bool ArgCaptures?

xbolva00 added inline comments.Jun 2 2018, 11:40 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
2276

Or maybe we can turn larger allocations into global variables also? + allocations with unknown (not constant) size?

2294

this looks bad, any advice how to do it properly?

Maybe just remove all free calls in Malloc's uses?

xbolva00 added inline comments.Jun 2 2018, 11:55 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
2276

Ah, possible data race on that global var..

Friendly ping :)

xbolva00 abandoned this revision.Jun 12 2018, 12:26 PM