This is an archive of the discontinued LLVM Phabricator instance.

[BasicAA] Don't extend pointer size
ClosedPublic

Authored by nikic on Sep 28 2021, 2:20 PM.

Details

Summary

BasicAA GEP decomposition currently performs all calculation on the maximum pointer size, but at least 64-bit, with an option to double the size. The code comment claims that this improves analysis power when working with uint64_t indices on 32-bit systems. However, I don't see how this can be, at least while maintaining correctness:

When working on canonical code, the GEP indices will have GEP index size. If the original code worked on uint64_t with a 32-bit size_t, then there will be truncs inserted before use as a GEP index. Linear expression decomposition does not look through truncs, so this will be an opaque value as far as GEP decomposition is concerned. Working on a wider pointer size does not help here (or have any effect at all).

When working on non-canonical code (before first InstCombine), the GEP indices are implicitly truncated to GEP index size. The BasicAA code currently just ignores this fact completely, and pretends that this truncation doesn't happen. This is incorrect.

I believe that for correctness reasons, it is important to work on the actual GEP index size to properly model potential overflow. BasicAA tries to patch over the fact that it uses the wrong size (see adjustToPointerSize), but it only does that in limited cases (only for constant values, and not all of them either). I'd like to move this code towards always working on the correct size, and dropping these artificial pointer size adjustments is the first step towards that.

Diff Detail

Event Timeline

nikic created this revision.Sep 28 2021, 2:20 PM
nikic requested review of this revision.Sep 28 2021, 2:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 28 2021, 2:20 PM
nikic updated this revision to Diff 376369.Sep 30 2021, 2:30 PM
nikic edited the summary of this revision. (Show Details)

Rebase

When working on non-canonical code (before first InstCombine), the GEP indices are implicitly truncated to GEP index size. The BasicAA code currently just ignores this fact completely, and pretends that this truncation doesn't happen. This is incorrect.

Agreed. For reference, this was added in D38662.

I'd like to move this code towards always working on the correct size, and dropping these artificial pointer size adjustments is the first step towards that.

I tried to add a test showing the issue in 413b7ac6b535, but it still gets it wrong with this patch I think. Do you know which other parts still need adjusting?

nikic added a comment.Oct 1 2021, 3:52 AM

I'd like to move this code towards always working on the correct size, and dropping these artificial pointer size adjustments is the first step towards that.

I tried to add a test showing the issue in 413b7ac6b535, but it still gets it wrong with this patch I think. Do you know which other parts still need adjusting?

We still need to model the actual truncation behavior. Here's my current WIP patch for that: https://gist.github.com/nikic/a9d40f6c739d60e0b6c66ce9edda4363 And there's a few more issues, e.g. our "all positive" code doesn't account for the fact that multiplying / adding positive numbers might make them non-positive.

nikic updated this revision to Diff 376602.Oct 1 2021, 12:23 PM

Rebase over additional test, to show a case this fixes without additional truncation handling.

fhahn accepted this revision.Oct 5 2021, 1:18 AM

LGTM, thanks! Might be good to wait a day or two with landing in case there are any more comments from the people who were involved D38662

This revision is now accepted and ready to land.Oct 5 2021, 1:18 AM
This revision was automatically updated to reflect the committed changes.