Page MenuHomePhabricator

Run bounds checking sanitizer earlier to make it easier to optimize away its checks.
AbandonedPublic

Authored by jgalenson on Jul 18 2018, 9:13 AM.

Details

Reviewers
eugenis
chandlerc
Summary

Running the bounds checking sanitizer earlier makes it easier for other optimizations to remove the checks it inserts.
While it could also inhibit other optimizations, I ran a few benchmarks and this did seem to improve performance and code size slightly.

Note that I'm not sure how to hook this up to the new PM, as I couldn't find a similar hook into the beginning of the pipeline. Are there any suggestions on what to do about that, or is it fine for this to work just on the old PM?

Diff Detail

Event Timeline

jgalenson created this revision.Jul 18 2018, 9:13 AM

Are you sure this will actually do what you want, in general? I suspect it will end up missing bounds checks in some cases because it's running it too early (before mem2reg/inlining/etc).

Are you sure this will actually do what you want, in general? I suspect it will end up missing bounds checks in some cases because it's running it too early (before mem2reg/inlining/etc).

No, I'm not sure; that's one reason I'm asking for comments. But I don't see any specific problems. For example, I don't see why inlining would matter; the checks should still be added, just before inlining instead of after (which of course affects the inlining heuristic, but that's another matter). I don't understand mem2reg as well, though. Do you have specific examples you think might fail?

I was thinking about this in terms of other sanitizers I know, specifically the integer overflow sanitizer. That adds overflow checks in Clang, which is before all of these LLVM passes. So my thought was that moving bounds checks to be inserted earlier brings it closer to how the integer overflow sanitizer works.

ObjectSizeOffsetEvaluator may fail to trace the address back to the underlying object, and we will end up not inserting the check.
Does ChecksUnable statistic go up with this change?

This sanitizer has a bit of a strange design compared to other sanitizers; it tries to compute the size of the base object using the IR at the point the pass runs. So the later it runs, the more information it has. Trivial example:

static int accumulate(int* foo, int n) {
  int sum = 0;
  for (unsigned i = 0; i < n; i++)
    sum += foo[i];
  return sum;
}
extern void fill(int *arr, int n);
int dosum(int n) {
  int foo[1000];
  fill(foo, n);
  return accumulate(foo, n);
}

Some checks can be removed by asking SCEV for offset bounds, see SafeStack::IsAccessSafe for an example.

Good call; I had figured that running it earlier might just impede other optimizations, but I forgot that it would also hide size information.

I thought this was the easiest approach compared to changing the pass pipeline or adding extra checks in here. But I hadn't realized I could use SCEV that easily. From trying it quickly, I think this can remove the check in this testcase (and even some others this couldn't remove). I'll work on making a patch that does that instead.

Thanks for the quick feedback and suggestions!

jgalenson abandoned this revision.Jul 24 2018, 8:24 AM

I just committed https://reviews.llvm.org/rL337830, which solves this problem the way eugenis suggested.