This is an archive of the discontinued LLVM Phabricator instance.

[MemoryBuiltins] Determine the size of a global w/o initializer
Needs ReviewPublic

Authored by jdoerfert on Jan 29 2020, 8:35 PM.

Details

Summary
NOTE: This is a patch to help start a discussion in response to http://lists.llvm.org/pipermail/llvm-dev/2020-January/138725.html

The idea is that a initializer is not necessary if we know the
definition is exact and it is not an empty object.

Diff Detail

Event Timeline

jdoerfert created this revision.Jan 29 2020, 8:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 29 2020, 8:35 PM

Unit tests: pass. 62318 tests passed, 0 failed and 838 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Thanks for this!

It's been a while since I've worked with memory-ish things in LLVM. Please forgive my questions if they're silly :)

This approach looks good to me, with the caveat that I'm not an expert in all of the creative, "supported" tricks that exist for lying to the compiler about object sizes.

Please also add a test for a zero-sized and nonzero-sized global (llvm.objectsize might be a good hammer for that?). Otherwise, I'm happy when Hal is

llvm/test/Analysis/BasicAA/bug.23540.ll
11

Can you expand a bit on why PartialAlias is the correct answer here, please?

I'd naively expect this to be either MustAlias (both pointers are to a 32 bit object, and there're only 32 bits in this allocation; derefing anything else is UB), or MayAlias (there're only 32 bits here, but maaaaybe you're pointing to ((int*)&c + 1), and that's OK.)

jdoerfert marked an inline comment as done.Jan 30 2020, 9:22 PM

Please also add a test for a zero-sized and nonzero-sized global (llvm.objectsize might be a good hammer for that?).

So there are existing test with 0 sized objects that would fail if we do not have the Size != 0 check but I'll add a few more to nail down what we expect as soon as I find the time. It might take a week or two though.

llvm/test/Analysis/BasicAA/bug.23540.ll
11

My reasoning is that arrayidx6 points right after c and arrayidx points somewhere based on c. There are no accesses so the alias query is done with an unspecified size on the access part which should mean the two accesses can very well overlap.

george.burgess.iv marked an inline comment as done.Feb 2 2020, 2:57 PM

So there are existing test with 0 sized objects that would fail if we do not have the Size != 0 check but I'll add a few more to nail down what we expect as soon as I find the time. It might take a week or two though.

SGTM -- thanks!

llvm/test/Analysis/BasicAA/bug.23540.ll
11

I don't quite agree with the unspecified size part: if the access size were entirely unknown, we could only correctly return PartialAlias in cases with provably identical object offsets, since PartialAlias *guarantees* the two accesses will overlap somehow.

Experimentally, we report NoAlias for (%b, %c) in:

define void @f() {
  %a = alloca i32
  %b = bitcast i32* %a to i8*
  %c = getelementptr i8, i8* %b, i64 1
  ret void
}

So I think the sizes here should be 4 bytes. If that's the case, MustAlias seems like the optimal answer, which means that PartialAlias is also correct. sizeof(access) == sizeof(underlying_object) in both cases means there's really no room for nonzero object offsets. :)

If this sounds correct, please update the newly-added comment.

jdoerfert marked an inline comment as done.Feb 2 2020, 7:51 PM
jdoerfert added inline comments.
llvm/test/Analysis/BasicAA/bug.23540.ll
11

So I added a d = gep %b, undef which is closer to the test case that changed and which results in a MayAlias: https://godbolt.org/z/KT4F5z

I don't follow the MustAlias argument. There is no access here, so we talk about pointers only.
The pointers are based on the same thing with an unknown offset from each other (at least that seems to me how undef is interpreted here).

I'll come up with more tests asap.

george.burgess.iv marked an inline comment as done.Feb 2 2020, 10:56 PM
george.burgess.iv added inline comments.
llvm/test/Analysis/BasicAA/bug.23540.ll
11

There is no access here, so we talk about pointers only.

Correct; I was just interpreting

There are no accesses so the alias query is done with an unspecified size on the access part

as "the alias queries get handed LocationSizes that don't hasValue()." If the sizes were unspecified, we'd get MayAlias instead of NoAlias for the (%b, %c) query on your godbolt link.

The MustAlias argument is based on this, but is honestly more of a tangent than anything. Referring again to the godbolt link, the only derefable (in a well-defined way) i32* we can form from %a must point directly to %a. So if we have two i32*s based on %a, ISTM inferring MustAlias would be valid, since MustAlias doesn't imply pointer equality.

In any case, we both now agree that PartialAlias is OK here, so I'm happy to call this resolved unless you wanna dive further into this :)