This is an archive of the discontinued LLVM Phabricator instance.

[Assignment Tracking][14/*] Account for assignment tracking in instcombine
ClosedPublic

Authored by Orlando on Sep 5 2022, 7:58 AM.

Details

Summary

Most of the updates here are just to ensure DIAssignID attachments are maintained and propagated correctly.

Diff Detail

Event Timeline

Orlando created this revision.Sep 5 2022, 7:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 5 2022, 7:58 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
Orlando requested review of this revision.Sep 5 2022, 7:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 5 2022, 7:58 AM
jmorse added a subscriber: jmorse.Sep 8 2022, 10:01 AM

Some questions, but looks good, and good to have test coverage for all of this.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
299–308

Is there a risk that, if multiple stores have the same assign ID (which I think is possible?) that other dbg.assign intrinsics down other code paths will have their Values updated?

This might not actually be an issue, because if there's a call to memset (this context) then those parts won't have been promoted to Values.

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
167–168

Note that D133118 fiddles with this area, as the call below to replaceInstUsesWith doesn't update debug users. With D133118 landed, I think this code is fine.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
4166–4169

Colour me delighted.

llvm/test/DebugInfo/Generic/assignment-tracking/instcombine/alloca-bitcast.ll
22

This then leads to the dbg.assign referring to a cast (tmpcast?), not to the alloca itself, which seems wrong. Possibly related to what I mentioned above.

(non-edit: Or, is it OK?)

llvm/test/DebugInfo/Generic/assignment-tracking/instcombine/sink-store.ll
29–30

Non-review: steamingly good that we can represent effective use-before-store / def's like this.

llvm/test/DebugInfo/Generic/assignment-tracking/instcombine/sink.ll
80–81

If I'm reading the CHECK lines right, it's the definition of %b that sinks, not the store or dbg.assign. That's not quite what the test comment says it covers, sinking of an instruction with a DIAssignID?

llvm/test/DebugInfo/Generic/assignment-tracking/instcombine/storemerge.ll
84

NEXT

Orlando updated this revision to Diff 459730.Sep 13 2022, 6:14 AM
Orlando marked 5 inline comments as done.

+ Address comments
+ Remove tbaa metadata from tests
+ Reduce test memset.ll size by regenerating the IR after swapping std::memset for __builtin_memset

jmorse requested changes to this revision.Sep 14 2022, 8:03 AM

All round good; I've marked one of the questions that I really do have a bit of concern with -- is that a scenario that can actually happen? AFAIUI:

  • Store instructions can be duplicated / unrolled as needed by llvm,
  • They'll initially have the same DIAssignID,
  • The marked code RAUWs the DIAssignID users, which could affect multiple dbg.assigns duplicated with the stores.

The answer to this may well be "no that doesn't happen", analysis appreciated though. Even if it does happen, I imagine that such scenarios are immediately optimised into not existing.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
299–308

^

This revision now requires changes to proceed.Sep 14 2022, 8:03 AM

Sorry - I didn't submit my inline responses (requires an extra button click after updating the patch, and I forgot this time).

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
299–308

This might not actually be an issue, because if there's a call to memset (this context) then those parts won't have been promoted to Values.

This matches my feelings too. Except! I tried to get a reproducer for your question in D133294 and actually stumbled into a case where this is... interesting. There's a dbg.assign that is linked to a memset in this situation has an undef that gets converted to a non-undef value here.

In my reproducer, the undef is added as a result of the old behaviour in D133315, which has since changed. But it is conceivable that an undef could've been introduced by different circumstances.

It feels unnatural for a debug intrinsic to ever leave the "undef" state, but in this case it's definitely not wrong. I don't know if that changes your feelings at all, but thought it was worth mentioning.

pre-submit EDIT:
Your concerns (mainline comments) sound possible to me - I would prefer to guard this, I'll update it.

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
167–168

I'll double-check it looks right when rebasing before landing? I think it'd be okay either way (see related reply in test alloca-bitcast.ll).

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
4166–4169

(this of course relies on the use-before-def being preserved, which probably is a bit hopeful)

llvm/test/DebugInfo/Generic/assignment-tracking/instcombine/alloca-bitcast.ll
22

I think it's okay either way. We still have the link to the alloca through the DIAssignID and all assignment tracking code that looks at addresses to find base-storage will be stripping casts etc. In addition (I mention in the TODO list in the docs) I'm think that the address and address-expression are possibly redundant.

llvm/test/DebugInfo/Generic/assignment-tracking/instcombine/sink.ll
80–81

You are right but I think the comment does cover this. The "instruction used by a dbg.assign" is %b = getelementptr ... (used-by, rather than linked-to).

Orlando planned changes to this revision.Sep 14 2022, 8:09 AM
jmorse added inline comments.Sep 15 2022, 3:17 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
299–308

I think recovering 'undef' back to a real Value is on-theme for assignment tracking -- after all, a primary outcome of all of this is that we fall back to using the stack location to identify the variables value, instead of an SSA Value. I suppose a failure mode would be where a store is rewritten to store a different value to what the original assignment "meant": imagine if we had:

store %foo, %ptr, !DIAssignID !1
store %bar, %ptr, !DIAssignID !2

And we deleted the second store but mutated the first to store %bar -- then we would restore connected dbg.assigns to the wrong value, if they became undef. But that would be erroneous behaviour anyway (to mutate stores like that), right? Otherwise, it doesn't matter if we restore Values to undef dbg.assigns at arbitary points in the function, so long as the store continues to be a source of truth about the original assignment.

I'd suggest adding a test for the undef-to-not-undef behaviour for referencing, might be worth putting it in the discourse post too.

llvm/test/DebugInfo/Generic/assignment-tracking/instcombine/sink.ll
80–81

Woot, I misread it.

Orlando updated this revision to Diff 460379.Sep 15 2022, 5:35 AM

+ Guard memset -> store value update

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
299–308

Interesting question. I'm not sure if such a transformation exists (as in "I don't know", not "I think not"). In that scenario you are right that the value would become wrong there. The whole "web of shared DIAssignID after unrolling" thing could also be a problem here. It would be nice if we didn't need that, but IIRC we decided it was necessary & I'm not particularly keen on re-evaluating that at this moment.

I've gone with the safest option here (if (uses old value) use new value), is that okay with you?

FWIW the test memset.ll in this patch was testing the undef -> known value behaviour, though I've now updated it.

A typo and a couple of nitish suggestions.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
298–299
299–307
llvm/test/DebugInfo/Generic/assignment-tracking/instcombine/memset.ll
16
jmorse accepted this revision.Nov 17 2022, 3:03 AM

LGTM

This revision is now accepted and ready to land.Nov 17 2022, 3:03 AM
Orlando marked an inline comment as not done.Nov 17 2022, 7:48 AM
Orlando added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
167–168

The test that exercises this particular modification, alloca-bitcast.ll, doesn't work when updated to use opaque pointers because the bitcast is noop and gets removed. I spent a little time seeing if the typed pointer codepath could be removed, but unfortunately a bunch of tests that haven't migrated to opaque pointers yet hit it.

https://llvm.org/docs/OpaquePointers.html#version-support claims that in LLVM 15.0 "Typed pointers are still available, but only supported on a best-effort basis and may be untested", so I feel that deleting the test (rather than adding a typed pointer test) is reasonable. Especially given it's a one-liner and quite "obvious". Are you okay with that @jmorse?

jmorse added inline comments.Nov 17 2022, 7:53 AM
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
167–168

Hmmmmmmm. It feels bad to lose coverage of something we have coverage of -- if what you say is true, that the bitcast is deleted as a noop, then presumably someone someday is going to delete this function right? (PromoteCastOfAllocation). Can we have the test operate in typed-pointer mode for the forseeable future, with a comment indicating that someone can delete it when they delete this function?

This revision was landed with ongoing or failed builds.Nov 18 2022, 1:29 AM
This revision was automatically updated to reflect the committed changes.
Orlando marked 4 inline comments as done.

store-new-type.ll had to be reworked to work with opaque pointers - this area is a bit fiddly so I just copied an existing non-debug test for the code path and added some metadata by hand.