Page MenuHomePhabricator

[InstCombine] Support multiple comparisons in foldAllocaCmp()

Authored by nikic on Feb 21 2023, 7:52 AM.



foldAllocaCmp() needs to fold all comparisons of an alloca at the same time, to ensure that there is a consistent view of the alloca address. Currently, it folds "all" comparisons by limiting to the case where there is only one. This patch switches the algorithm to instead actually collect and fold all comparisons.

Something we need to be careful about here is that there may be comparisons where both sides of the icmp are based on the alloca. Such comparisons are comparing offsets of the alloca, and as such can be ignored here, but shouldn't be folded to false!

Diff Detail

Event Timeline

nikic created this revision.Feb 21 2023, 7:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 21 2023, 7:52 AM
nikic requested review of this revision.Feb 21 2023, 7:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 21 2023, 7:52 AM
goldstein.w.n added inline comments.Mar 20 2023, 9:16 AM

Instead of just deleting this, can you replace it with an equally useful comment about what the new logic is doing?


I don't understand this TODO, isn't getUnderlyingObject(*U) == Alloca doing just that?


Can you add TODO to canonicalize ule/uge/sle/sge -> ult/ugt/slt/sgt?


!CmpInst::isTrueWhenEqual(ICmp->getPredicate()) -> ICmp->getPredicate() == ICmpInst::ICMP_NE is clearer imo.


Is there a reason this isn't in InstructionSimplify?

goldstein.w.n added inline comments.Mar 20 2023, 9:37 AM

You could add a todo here for some simplifications i.e

define i1 @foo(i64 %x) {
  %alloc = alloca i64, i64 8
  %p = getelementptr i64, ptr %alloc, i64 %x
  %q = getelementptr i64, ptr %alloc, i64 3
  %cmp = icmp eq ptr %p, %q
  ret i1 %cmp

is really icmp eq i64 %x, 3.

nikic added inline comments.Mar 20 2023, 10:51 AM

getUnderlyingObject() does not look through phi nodes.


This can't be in InstSimplify because it is a correctness requirement that *all* comparisons of the alloca are replaced at the same time, something that InstSimplify cannot guarantee.

nikic updated this revision to Diff 506889.Mar 21 2023, 3:07 AM
nikic marked 5 inline comments as done.

Improve comments.

nikic updated this revision to Diff 506890.Mar 21 2023, 3:11 AM

Upload correct patch.

nikic added inline comments.Mar 21 2023, 3:12 AM

I don't think that would be legal, because it effectively retains a capture of the alloca, through which the address may be determined or constrained, leading to a contradiction with another fold.

goldstein.w.n added inline comments.Mar 26 2023, 4:08 PM

Why does something like:

target datalayout = "p:32:32"
@gp = global ptr null, align 8
declare ptr @hidden_inttoptr()
declare void @witness(i1, i1)
define void @src() {
  %m = alloca i8, i32 4
  %lgp = load ptr, ptr @gp, align 8
  %rhs2 = call ptr @hidden_inttoptr()
  %cmp1 = icmp eq ptr %m, %rhs2
  %cmp2 = icmp eq ptr %lgp, %m
  call void @witness(i1 %cmp1, i1 %cmp2)
  ret void

Work? Isn't %m on both sides? (Although it is fine to reduce this snippet), just dont understand what U->getOperandNo() is actually checking for.

nikic added inline comments.Mar 30 2023, 3:28 AM

What we care about is whether the alloca is on both sides of the same icmp. icmp based_on(%alloca), %other can be assumed to be false, icmp %other, based_on(%alloca) can be assumed to be false, but icmp based_on(%alloca), based_on(%alloca) may be true or false (but conversely, is not an escape, so we can ignore it entirely in this fold).

This revision is now accepted and ready to land.Apr 13 2023, 10:12 AM
This revision was landed with ongoing or failed builds.Apr 14 2023, 2:36 AM
This revision was automatically updated to reflect the committed changes.