This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Revert rL226781 "Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available." (PR47592)
ClosedPublic

Authored by lebedev.ri on Oct 3 2020, 1:14 PM.

Details

Summary

(it was introduced in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html)

This canonicalization seems dubious.

Most importantly, while it does not create inttoptr casts by itself,
it may cause them to appear later, see e.g. D88788.

I think it's pretty obvious that it is an undesirable outcome,
by now we've established that seemingly no-op inttoptr/ptrtoint casts
are not no-op, and are no longer eager to look past them.
Which e.g. means that given

%a = load i32 
%b = inttoptr %a
%c = inttoptr %a

we likely won't be able to tell that %b and %c is the same thing.

We could of course try to cleanup the IR afterwards, by enhancing
the cast-of-load transform to deal with non-single-use loads,
and i even tried that already in D75505, as it was rightfully
pointed out, that is very much not compile-time free:
https://llvm-compile-time-tracker.com/compare.php?from=871d03a6751e0f82e210c80a881ef357c5633a26&to=782be5b99377b62e998e4157ddede0fa296664b5&stat=instructions

Thusly, i'd propose to simply not perform such a canonicalization.
The original motivational RFC does not state what larger problem that canonicalization
was trying to solve, so i'm not sure how this plays out in the larger picture.

Does anyone have any thoughts?

See https://bugs.llvm.org/show_bug.cgi?id=47592

Diff Detail

Event Timeline

lebedev.ri created this revision.Oct 3 2020, 1:14 PM
nlopes accepted this revision.Oct 3 2020, 2:58 PM

Love it, thanks!
This gets rid of a lot of type punning issues through load/store of integers. Not introducing inttoptr during optimization is a very healthy goal.

This revision is now accepted and ready to land.Oct 3 2020, 2:58 PM
nikic added a subscriber: nikic.Oct 4 2020, 12:54 AM

as it was rightfully pointed out, that is very much not compile-time free: https://llvm-compile-time-tracker.com/compare.php?from=871d03a6751e0f82e210c80a881ef357c5633a26&to=782be5b99377b62e998e4157ddede0fa296664b5&stat=instructions

Looks free to me?

In any case, this change looks reasonable to me. GVN has no problems deduplicating load/stores from different types (https://llvm.godbolt.org/z/5nTjWE), so I'm not sure what this canonicalization was useful for.

FWIW, I still very much feel that this is the correct canonicalization, and that downstream problems *must* be fixed downstream. Avoiding this canonicalization doesn't actually fix them, it just makes us less *aware* of the problems that still fundamentally exist. =[

That said, I'm not heavily involved in LLVM, and so if everyone currently involved thinks this is a good change, I'm not going to stand in the way. It just makes no sense to me.

Rebase/fix remaining tests.

Not introducing inttoptr during optimization is a very healthy goal.

Thank you for pointing that out.
Indeed, that is very precisely my goal here.

We can revisit that patch afterwards.

In any case, this change looks reasonable to me. GVN has no problems deduplicating load/stores from different types (https://llvm.godbolt.org/z/5nTjWE), so I'm not sure what this canonicalization was useful for.

Yep.

FWIW, I still very much feel that this is the correct canonicalization, and that downstream problems *must* be fixed downstream. Avoiding this canonicalization doesn't actually fix them, it just makes us less *aware* of the problems that still fundamentally exist. =[

That said, I'm not heavily involved in LLVM, and so if everyone currently involved thinks this is a good change, I'm not going to stand in the way. It just makes no sense to me.

Thank you for commenting!

Herald added a project: Restricted Project. · View Herald TranscriptOct 4 2020, 10:28 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

FWIW, I still very much feel that this is the correct canonicalization, and that downstream problems *must* be fixed downstream. Avoiding this canonicalization doesn't actually fix them, it just makes us less *aware* of the problems that still fundamentally exist. =[

I'd agree if we excluded all pointers from canonicalization. But the semantics of inttoptr and inttoptr-equivalent memory operations are weird; in general, I'm not sure we can recover the original semantics of the code if we throw away the pointer-ness of pointer load/store operations.

To address the issue at hand, I think changing the isNonIntegralPointerType() check to just isPtrOrPtrVectorTy() would be enough. I think that might make sense?

To address the issue at hand, I think changing the isNonIntegralPointerType() check to just isPtrOrPtrVectorTy() would be enough. I think that might make sense?

I briefly considered that, but the motivational example in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html
was really about pointer-typed things. So while we could do that, i'm not sure how useful this fold really is overall.

FWIW, I still very much feel that this is the correct canonicalization, and that downstream problems *must* be fixed downstream. Avoiding this canonicalization doesn't actually fix them, it just makes us less *aware* of the problems that still fundamentally exist. =[

I'd agree if we excluded all pointers from canonicalization. But the semantics of inttoptr and inttoptr-equivalent memory operations are weird; in general, I'm not sure we can recover the original semantics of the code if we throw away the pointer-ness of pointer load/store operations.

To address the issue at hand, I think changing the isNonIntegralPointerType() check to just isPtrOrPtrVectorTy() would be enough. I think that might make sense?

Keeping loads and stores of pointers as pointers to the extent possible doesn't seem like a bad idea, but I'm worried people will feel like this gives a *semantic* guarantee that isn't really there. Fundamentally, LLVM still doesn't currently have typed memory. All of the optimizer is built upon this assumption.

Anyways, while it doesn't seem intrinsically bad to preserve pointer types as much as possible, I feel like the underlying problem should be addressed in a more fundamental way -- that this change will just shift the problem to more complex cases where the frontend happens to use a memcpy or something similar. I wonder if revisiting D75505 makes somewhat more sense, although clearly it would need some different approach to address the compile time issues.

Keeping loads and stores of pointers as pointers to the extent possible doesn't seem like a bad idea, but I'm worried people will feel like this gives a *semantic* guarantee that isn't really there. Fundamentally, LLVM still doesn't currently have typed memory. All of the optimizer is built upon this assumption.

LLVM currently isn't internally consistent. See https://bugs.llvm.org/show_bug.cgi?id=34548 . I should probably make a LangRef patch so the "pointer aliasing" section indicates there's an issue here.

So, we can't really teach SCEV about this: D88788 (not without the https://bugs.llvm.org/show_bug.cgi?id=47592 at lease)
And we can't recover the situation post-inlining in instcombine: D88842.

It really does look like this fold is actively breaking
otherwise-good IR, in a way that is not recoverable.
And that means, this fold isn't helpful in exposing the passes
that are otherwise unaware of these patterns it produces.

I'm proceeding with this patch.

This comment was removed by lebedev.ri.
nikic added a comment.Oct 5 2020, 2:56 PM

This ended up having a rather large impact...

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=567462b48eba1c2d286ce97117994463f4535d2e&to=e00f189d392dd9bf95f6a98f05f2d341d06cd65c&stat=instructions
Code size: https://llvm-compile-time-tracker.com/compare.php?from=567462b48eba1c2d286ce97117994463f4535d2e&to=e00f189d392dd9bf95f6a98f05f2d341d06cd65c&stat=size-text
Largest code size impact is CMakeFiles/pairlocalalign.dir/constants.c.o with a code-size reduction of ~6%.

Always happy about compile-time improvements, but the large code size changes indicate that this has a pretty significant impact on optimization, and as usual it's hard to tell whether it's a good or a bad one :)