This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Don't replace allocas with smaller globals
ClosedPublic

Authored by vitalybuka on Jun 16 2017, 5:58 PM.

Details

Summary

InstCombine replaces large allocas with small globals consts causing buffer overflows
on valid code, see PR33372.

This fix permits this optimization only if the global is dereference for alloca size.

Diff Detail

Repository
rL LLVM

Event Timeline

vitalybuka created this revision.Jun 16 2017, 5:58 PM

Removed file

vitalybuka retitled this revision from [InstCombine] Don't replace allocas with globals we didn't prove that the global is large enough. to [InstCombine] Don't replace allocas with globals if we can't prove that it's large enough..Jun 16 2017, 6:13 PM
vitalybuka edited the summary of this revision. (Show Details)
vitalybuka edited the summary of this revision. (Show Details)Jun 16 2017, 6:19 PM

Fixed function name

efriedma added a subscriber: efriedma.

I wonder if it makes sense to more aggressively transform the IR in certain cases rather than bail out, since we can prove that the IR is reading uninitialized memory. Something for a future patch, I guess, if someone has a testcase where it matters in practice.

lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
169 ↗(On Diff #102927)

Messing with getArraySize() is unnecessary; just return false if isArrayAllocation() is true.

193 ↗(On Diff #102927)

If you're going to change the functionality here, you also need to change the name.

test/Transforms/InstCombine/memcpy-from-global.ll
211 ↗(On Diff #102927)

Testcase needs comment explaining what it's checking for.

vitalybuka marked 3 inline comments as done.

efriedma's comments

efriedma edited edge metadata.Jun 20 2017, 12:10 PM

Hmm, actually, thinking about it a bit more, I'm not sure this patch is right.

If you were to assume that out-of-bounds loads return undef rather than have undefined behavior, this patch would be unnecessary. There's only one memcpy that writes to the alloca. That memcpy writes N known bytes to the alloca. Any other instruction which accesses the alloca is either reading one of those N bytes, or returns undef, so the returned values are consistent with reading directly from the global.

Of course, that isn't the world we live in; we can't speculate out-of-bounds loads. Therefore, the safety check needs to make sure we aren't introducing any such loads. There are a couple of ways to do that:

  1. Prove that the constant pointer points to at least sizeof(alloca type) bytes of memory. The loads in the rewritten code are out-of-bounds only if they were out-of-bounds in the original code.
  2. Ensure that every load is dominated by the initializing memcpy, and the memcpy copies at least sizeof(alloca type) bytes of memory. If the memcpy is copying too many bytes, it's immediately undefined behavior. Any load dominated by the memcpy is one of the following: in-bounds in the original code, out-of-bounds in the original code, or dominated by undefined behavior.

This patch implements (2)... but only partially: it's missing the dominance check. Testcase something like this:

const char x[1] = { 3 };
char z[10];
int f(bool this_is_false) {
  char y[10];
  if (this_is_false) memcpy(y, x, 10); 
  memcpy(z, y, 10); 
}
lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
175 ↗(On Diff #103133)

getZExtValue() will crash on very large lengths.

176 ↗(On Diff #103133)

This assertion doesn't seem justified; sure, the memcpy has undefined behavior, but that shouldn't crash the compiler.

vitalybuka marked 2 inline comments as done.Jun 20 2017, 2:11 PM

Hmm, actually, thinking about it a bit more, I'm not sure this patch is right.

If you were to assume that out-of-bounds loads return undef rather than have undefined behavior, this patch would be unnecessary. There's only one memcpy that writes to the alloca. That memcpy writes N known bytes to the alloca. Any other instruction which accesses the alloca is either reading one of those N bytes, or returns undef, so the returned values are consistent with reading directly from the global.

We have problem with code like this which just crashes even without sanitizers:

const char KKK[] = {1, 1, 2};
char bbb[100000];

void test2() {
  char cc[sizeof(bbb)];
  memcpy(cc, KKK , sizeof(KKK));
  memcpy(bbb, cc, sizeof(bbb));
}

Of course, that isn't the world we live in; we can't speculate out-of-bounds loads. Therefore, the safety check needs to make sure we aren't introducing any such loads. There are a couple of ways to do that:

  1. Prove that the constant pointer points to at least sizeof(alloca type) bytes of memory. The loads in the rewritten code are out-of-bounds only if they were out-of-bounds in the original code.

Yes, but seems this requires more work, and I will have no time to do that any time soon.

  1. Ensure that every load is dominated by the initializing memcpy, and the memcpy copies at least sizeof(alloca type) bytes of memory. If the memcpy is copying too many bytes, it's immediately undefined behavior. Any load dominated by the memcpy is one of the following: in-bounds in the original code, out-of-bounds in the original code, or dominated by undefined behavior.

This patch implements (2)... but only partially: it's missing the dominance check. Testcase something like this:

const char x[1] = { 3 };
char z[10];
int f(bool this_is_false) {
  char y[10];
  if (this_is_false) memcpy(y, x, 10); 
  memcpy(z, y, 10);
}

This is probably separate issue. And I am not sure if understand the problem. If we get to "memcpy(z, y, 10);" without "memcpy(y, x, 10);" I'd expect we don't care if "y" is uninitialized bytes or global constant. We will have no buffer overflow which I am trying to fix.

Removed assert and getZExtValue

If we get to "memcpy(z, y, 10);" without "memcpy(y, x, 10);" I'd expect we don't care if "y" is uninitialized bytes or global constant. We will have no buffer overflow which I am trying to fix.

If "this_is_false" is true, the function has undefined behavior, if it's false, it overwrites z with uninitialized memory, which is fine (in IR). But it incorrectly passes the isCompletelyOverwritten() check, so instcombine will transform it to "memcpy(z, x, 10);", which is reading past the end of the global.

If we get to "memcpy(z, y, 10);" without "memcpy(y, x, 10);" I'd expect we don't care if "y" is uninitialized bytes or global constant. We will have no buffer overflow which I am trying to fix.

If "this_is_false" is true, the function has undefined behavior, if it's false, it overwrites z with uninitialized memory, which is fine (in IR). But it incorrectly passes the isCompletelyOverwritten() check, so instcombine will transform it to "memcpy(z, x, 10);", which is reading past the end of the global.

Oh, I see now, I didn't noticed that "if" already had overflow.

  1. Prove that the constant pointer points to at least sizeof(alloca type) bytes of memory. The loads in the rewritten code are out-of-bounds only if they were out-of-bounds in the original code.

I will try to look into this. Any pointers on how to find available size in the constant? I guess we need to handle the case when memcpy source points into a middle of the global.

There's a isDereferenceableAndAlignedPointer helper which does this sort of check. See also https://reviews.llvm.org/D34467.

vitalybuka retitled this revision from [InstCombine] Don't replace allocas with globals if we can't prove that it's large enough. to [InstCombine] Don't replace allocas with smaller globals.Jun 23 2017, 5:05 PM
vitalybuka edited the summary of this revision. (Show Details)

Use isDereferenceableAndAlignedPointer to check global size

efriedma added inline comments.Jun 23 2017, 5:22 PM
lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
180 ↗(On Diff #103811)

Doesn't the alignment here need to be the alignment of the alloca, rather than "1"?

Use alloca alignment

vitalybuka marked an inline comment as done.Jun 23 2017, 5:50 PM
vitalybuka added inline comments.Jun 23 2017, 5:55 PM
lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
180 ↗(On Diff #103811)

I'd expect it's not important as alignment of the source is not less than alloca.

vitalybuka marked an inline comment as done.Jun 23 2017, 5:56 PM

LGTM.

lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
180 ↗(On Diff #103811)

If you don't check the alignment, you need to check that the memcpy dominates the reads from the alloca.

efriedma accepted this revision.Jun 23 2017, 6:14 PM
This revision is now accepted and ready to land.Jun 23 2017, 6:14 PM