This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Support noalias calls in foldAllocaCmp()
Changes Planned · Public

Authored by nikic on Feb 20 2023, 8:56 AM.

Details

Reviewers
reames
spatel
Summary

This is a reboot of D120371, which adds support for noalias calls in foldAllocaCmp(). The only difference from alloca handling is that allocators might return null, in which case we must make sure not to fold away comparisons against null.
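
For illustration only, here is a minimal C sketch (not from the patch; the names sketch and some_global are made up) of the two kinds of comparisons involved:

#include <stdbool.h>
#include <stdlib.h>

int some_global;

// p comes from a noalias allocator call and does not otherwise escape.
bool sketch(void) {
    void *p = malloc(16);                         // noalias call; may return NULL
    bool eq_global = (p == (void *)&some_global); // candidate for folding to false:
                                                  // a fresh allocation cannot be &some_global
    bool eq_null = (p == NULL);                   // must be kept: the allocator can fail
    return eq_global || eq_null;                  // (p is deliberately leaked to keep the sketch short)
}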

Once this lands, I'll commit the removal of the old incorrect InstSimplify/CaptureTracking code in a separate commit, to reduce risk.

Diff Detail

Event Timeline

nikic created this revision.Feb 20 2023, 8:56 AM
nikic requested review of this revision.Feb 20 2023, 8:56 AM
Herald added a project: Restricted Project.Feb 20 2023, 8:56 AM
goldstein.w.n added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
6458

Maybe cache these in an optional so the second call to canFold doesn't need to recompute?

nikic added inline comments.Feb 20 2023, 12:06 PM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
6458

These can't be easily cached because they may work on different values (underlying object vs plain operand).

nikic planned changes to this revision.Feb 21 2023, 12:12 AM

I've thought about this a bit more, and I think there's another problem here: noalias calls are really the wrong criterion for this. Ideally, noalias would only be about provenance (especially now that we have allockind), and this optimization is not provenance-related.

However, even if we switch this to use isAllocationFn() and require that allocators make no heap layout guarantees, we may still run into problems with partial inlining. Consider this cute custom allocator:

// Let's ignore all the missing safety checks here
// (e.g. the cached block can be handed out twice)...
#include <stdlib.h>

void *cached_alloc; // a single cached block, initialized elsewhere

void *my_malloc(size_t size) {
    if (cached_alloc)
        return cached_alloc; // hand out the cached block
    return malloc(size);
}

void my_free(void *ptr) {
    if (ptr == cached_alloc) {
        cached_alloc = NULL; // retire the cached block; it must not reach free()
        return;
    }
    free(ptr);
}

Now, if this gets partially inlined, so that we see my_malloc as an opaque allocator but my_free is inlined into its callers, then we're going to optimize away that ptr == cached_alloc comparison, resulting in a miscompile.
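
Concretely, a hypothetical caller might look like this after partial inlining (use_once is a made-up name; assume my_free was inlined while my_malloc was not):

void use_once(void) {
    void *p = my_malloc(32);     // still an opaque allocator call
    // ... use p ...
    // inlined body of my_free(p):
    if (p == cached_alloc) {     // the fold would turn this into "if (false)"
        cached_alloc = NULL;
        return;
    }
    free(p);                     // so the cached block reaches free() even though
                                 // my_free was supposed to intercept it
}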

Incidentally, this specific example could already be a problem with the existing InstSimplify code (because it uses the icmp-of-load-of-global pattern, just missing the nonnull bit), but this patch has the potential to make things worse, so we probably shouldn't land it in this form.

Probably the proper way to support this would be to add an additional allockind like any_heap_layout and then only annotate non-replaceable allocators with it (e.g. malloc but not new) -- though there are still LTO hazards.

In the meantime I've landed https://github.com/llvm/llvm-project/commit/4d192f2d2a545c8cd51ef86f9e83209eccbe229e to move the CaptureTracking special case into the InstSimplify fold, so it at least doesn't affect other code.