Page MenuHomePhabricator

[InstSimplify] Don't fold comparisons of non-inbounds GEPs
Needs RevisionPublic

Authored by LemonBoy on Mar 1 2021, 1:14 AM.

Details

Summary

The logic employed by the constant folder assumes the GEPs to be inbounds, when applied to other GEPs the result can be potentially incorrect and cause miscompilations.

Diff Detail

Unit TestsFailed

TimeTest
40 msx64 debian > Flang.Semantics::resolve102.f90
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/flang/test/Semantics/test_errors.sh /mnt/disks/ssd0/agent/llvm-project/flang/test/Semantics/resolve102.f90 /mnt/disks/ssd0/agent/llvm-project/build/tools/flang/test/Semantics/Output/resolve102.f90.tmp /mnt/disks/ssd0/agent/llvm-project/build/bin/f18 -intrinsic-module-directory /mnt/disks/ssd0/agent/llvm-project/build/tools/flang/include/flang

Event Timeline

LemonBoy created this revision.Mar 1 2021, 1:14 AM
LemonBoy requested review of this revision.Mar 1 2021, 1:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2021, 1:14 AM
LemonBoy added inline comments.Mar 1 2021, 1:14 AM
llvm/lib/IR/ConstantFold.cpp
1837

Are non-inbounds GEP common enough to implement this?

lebedev.ri requested changes to this revision.EditedMar 1 2021, 1:16 AM
lebedev.ri added a subscriber: lebedev.ri.

This seems to be correct?
https://alive2.llvm.org/ce/z/JyimAB

This revision now requires changes to proceed.Mar 1 2021, 1:16 AM
nikic added a reviewer: aqjune.Mar 3 2021, 1:17 PM

@nlopes Can you please take a look at the alive result? I don't think the transform is correct, but alive says it is. The result of the GEP is a pointer of value zero. It has different provenance from the null pointer, but icmp eq should be a pure value comparison, independent of provenance, right?

llvm/lib/IR/ConstantFold.cpp
1837

I'd generally say that we should handle inbounds precisely just for the sake of clarity -- however, after reviewing all the folds below, it looks like all of them either require inbounds, or can only work on GEPs for which we would infer inbounds anyway (e.g. the gep GV1, 0 icmp gep GV2, 0 case). Given that, the early out here should be fine, and I think you can drop the TODO as well.

1854

Unrelated to this patch, but all the isSigned handling in this code looks wrong to me. This is effectively saying that GlobalValues can only appear in the lower half of the address space and not in the upper (negative) half, which is not true.

aqjune added a comment.Mar 3 2021, 5:33 PM

I don't have a clear model for the semantics of pointer comparison ATM; Pointer comparison *sometimes* needs to take provenance into consideration because LLVM folds p1 == p2 where p1 and p2 are pointing to two different zero-size objects having the same address into false.
Also, considering provenance into account allows aggressively folding pointer comparisons. It isn't clear how *frequently* the provenance should be considered.

But, I'm rather curious about how the miscompilation happened from this optimization. A gep with such offset isn't common, unless a programmer writes a code that subtracts a pointer from null (which is already fishy)?
It would be great if I can see the input that causes miscompilation.

aqjune added a comment.Mar 3 2021, 6:39 PM

I don't have a clear model for the semantics of pointer comparison ATM; Pointer comparison *sometimes* needs to take provenance into consideration because LLVM folds p1 == p2 where p1 and p2 are pointing to two different zero-size objects having the same address into false.
Also, considering provenance into account allows aggressively folding pointer comparisons. It isn't clear how *frequently* the provenance should be considered.

FWIW, these are a few examples and relevant links:

nikic added a comment.Mar 4 2021, 1:30 AM

I don't have a clear model for the semantics of pointer comparison ATM; Pointer comparison *sometimes* needs to take provenance into consideration because LLVM folds p1 == p2 where p1 and p2 are pointing to two different zero-size objects having the same address into false.
Also, considering provenance into account allows aggressively folding pointer comparisons. It isn't clear how *frequently* the provenance should be considered.

But, I'm rather curious about how the miscompilation happened from this optimization. A gep with such offset isn't common, unless a programmer writes a code that subtracts a pointer from null (which is already fishy)?
It would be great if I can see the input that causes miscompilation.

I'm not sure why zig generates this code, but the context here is https://github.com/ziglang/zig/issues/6408. The issue was exposed by D93820, as we previously simply folded away the GEP to null, losing provenance.

My understanding was that the current resolution on pointer comparison issues is that pointer comparison is value-only, and provenance-losing equality replacements in GVN are what we consider incorrect. But I haven't followed all the discussions.

The original case here was actually not "gep == null", but "ptrtoint gep == 0", in which case it's obviously a pure value comparison, but InstSimplify looks through that. Do you think that's the real issue?

But, I'm rather curious about how the miscompilation happened from this optimization. A gep with such offset isn't common, unless a programmer writes a code that subtracts a pointer from null (which is already fishy)?
It would be great if I can see the input that causes miscompilation.

As stated here the gep is being generated by the SCEV pass.
A minimal test case that shows the gep being created (but not this miscompilation) is here.

llvm/lib/IR/ConstantFold.cpp
1837

Some trivial geps can still be folded such as all-zero ones vs constant or unsigned comparisons with zero. But yes, I'll drop the TODO as those cases are pretty rare.

1854

Yes, that looks wrong. I guess this is unlikely to cause problems as most (every?) pointer comparisons against null are of eq/neq type.

But, I'm rather curious about how the miscompilation happened from this optimization. A gep with such offset isn't common, unless a programmer writes a code that subtracts a pointer from null (which is already fishy)?
It would be great if I can see the input that causes miscompilation.

As stated here the gep is being generated by the SCEV pass.
A minimal test case that shows the gep being created (but not this miscompilation) is here.

(i've already been looking at the scev problem, maybe will post patch soon-ish)

nikic resigned from this revision.Mar 4 2021, 1:59 AM

Resigning here, based on past experience trying to challenge the Word Of God. It really doesn't help that someone opened an alive issue for this without cross-referencing it, so the same discussion is repeated twice.

aqjune added a subscriber: nikic.Mar 4 2021, 3:37 AM

The original case here was actually not "gep == null", but "ptrtoint gep == 0", in which case it's obviously a pure value comparison, but InstSimplify looks through that. Do you think that's the real issue?

Actually this is indeed a problematic transformation; you said a very valid point.
Removing it has large impact (it leaves ptrtoint to an object which is conservatively considered to escape the object), so simply removing the transformation is hard ATM, but I believe it should be removed at some point.

It is bad that there were two discussions with the same topic; maybe the link was not shared here because the Alive2's side did not have a very clear solution about ptr comparison as well. If I could answer anything clear about the pointer comparison, the link must have been shared :(

(i've already been looking at the scev problem, maybe will post patch soon-ish)

+1, Fixing SCEV seems to be the easiest solution at this point.