This is an archive of the discontinued LLVM Phabricator instance.

[SROA] Propagate correct TBAA/TBAA Struct offsets
ClosedPublic

Authored by wsmoses on Feb 1 2021, 3:17 PM.

Details

Summary

SROA does not correctly account for offsets in TBAA/TBAA struct metadata.
This patch creates functionality for generating new MD with the corresponding
offset and updates SROA to use this functionality.

Resolves https://bugs.llvm.org/show_bug.cgi?id=47612

Diff Detail

Event Timeline

wsmoses created this revision.Feb 1 2021, 3:17 PM
wsmoses requested review of this revision.Feb 1 2021, 3:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2021, 3:17 PM
wsmoses edited the summary of this revision. (Show Details)Feb 1 2021, 3:18 PM
wsmoses added a subscriber: vchuravy.

We need a test case with nested structs as well, I somehow have the feeling that doesn't work properly rn.

Also some comments that seem straight forward. For the logic part, I hope someone else could take a look, I'd have to read up on struct TBAA.

llvm/include/llvm/IR/Metadata.h
645

Any reason to expose this. If not, maybe make them static members in the AAMDNode struct, potentially hidden.

llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
746

Style:
No braces.
Offset not off, same elsewhere.
MD (or similar) not M, same elsewhere.

749

you push back up to 4 elements, let's go with 4 here too.

785

shouldn't we break at the end?

llvm/lib/Transforms/Scalar/SROA.cpp
3388

No temporary GEPs please. What is wrong with GEP ?

3440

No temporary GEPs please. What is wrong with InBoundsGEP ?

wsmoses updated this revision to Diff 320661.Feb 1 2021, 7:02 PM
wsmoses marked 3 inline comments as done.

Address comments

Very much agreed on someone with more TBAA experience looking over.

I'm not sure that I have the same hunch about nested structs. We only need to modify the offset in the root node so I don't immediately see how nesting presents an issue. Mind elaborating on what you think could go wrong?

llvm/include/llvm/IR/Metadata.h
645

I can definitely see other users of such shifting, though if they always use AAMDNodes it will be encapsulated.

A public static method seems good to me.

llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
785

No, we want to extract all of the relevant TBAA.struct pieces. In essence TBAA.struct is a list of triples saying the offset, size, type. There therefore may be several relevant triples.

llvm/lib/Transforms/Scalar/SROA.cpp
3388

The problem is that the GEP instruction itself may be constant folded by the IRBuilder (and thus why it only returns a value and not a gep instruction).

3440

Same reply as above.

FYI this patch currently causes the TBAA in SROA basictest to fail because (at least from my understanding of TBAA) the TBAA on that memcpy is sized to one byte (meaning only the first part of the copy ends up with TBAA).

I want someone with more TBAA experience to check me on this assumption -- and likewise I assume one byte copy in the test should be changed to indicate the entire memory is of that type and thus the TBAA is propagated?

wsmoses marked 3 inline comments as done.Feb 1 2021, 7:09 PM
jdoerfert added inline comments.Feb 1 2021, 9:05 PM
llvm/lib/Transforms/Scalar/SROA.cpp
3388

Hm, do we need this for anything other than struct types?

wsmoses added inline comments.Feb 2 2021, 9:17 AM
llvm/lib/Transforms/Scalar/SROA.cpp
3388

When you say struct types do you mean tbaa.struct (used in memcpy), tbaa struct path information, or something else?

We definitely need this in tbaa.struct per the memcpy test case. I think we also want it for regular tbaa (e.g. you have a load which uses regular tbaa info and it should have its tbaa with the correct offset).

Likewise I can imagine optimization passes other than SROA needing to do this sort of offset procedure (hence my inclination to make the API public for the shifting).

llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
782

What if the offset is inside the specified region instead of at the boundary ? Shouldn't we keep that with a 0 offset and adapted size ? Although keeping it is not strictly necessary, but then we will loose some tbaa information.

nikic resigned from this revision.Feb 3 2021, 12:03 PM

(I don't know anything about TBAA)

wsmoses marked 2 inline comments as done.Feb 6 2021, 12:36 PM
wsmoses added inline comments.
llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
782

Good call, amended to include.

wsmoses updated this revision to Diff 321958.Feb 6 2021, 12:45 PM
wsmoses marked an inline comment as done.

Allow for partial resizes and update basictest.ll

wsmoses updated this revision to Diff 321959.Feb 6 2021, 12:48 PM

Actually include the update

wsmoses added a project: Restricted Project.Feb 10 2021, 3:18 PM
wsmoses updated this revision to Diff 323195.Feb 11 2021, 5:58 PM

Remove temporary GEP

wsmoses updated this revision to Diff 323197.Feb 11 2021, 6:00 PM

Merge commits

Looks reasonable to me.

jdoerfert accepted this revision.Feb 17 2021, 8:58 AM

All comments have been addressed and it looks certainly reasonable. Due to the lack of TBAA experts, let's go ahead: LGTM

This revision is now accepted and ready to land.Feb 17 2021, 8:58 AM
This revision was landed with ongoing or failed builds.Feb 17 2021, 8:59 AM
This revision was automatically updated to reflect the committed changes.

After this commit, this bug started to happen: https://bugs.llvm.org/show_bug.cgi?id=49825. Can you please take a look?