Page MenuHomePhabricator

[IR] Allow multiple instruction metadata attachments with same type
Needs ReviewPublic

Authored by tejohnson on Jun 30 2021, 5:19 PM.

Details

Reviewers
davidxl
snehasish
Summary

This is modeled on D20414 which added this support to global objects,
but not instructions. Note D67626 later refactored these interfaces
into the Value base class, but Instruction does not use this directly
because of special handling needed for the DbgLoc metadata.

I'm adding the support to instructions because it may be needed for
profile guided heap optimization
(https://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html)
depending on how that profile data is represented in the IR, as a
single allocation call may contain multiple heap profiles for
different contexts, and because it is an inconsistency in IR support
(the LLVM langref does not specify anything with regards to this
behavior).

I've added a generic Assembler test change similar to what was done
in D20414, but also augmented a branch-weight test to validate the
change made to BitcodeReader.cpp.

Diff Detail

Event Timeline

tejohnson created this revision.Jun 30 2021, 5:19 PM
tejohnson requested review of this revision.Jun 30 2021, 5:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2021, 5:19 PM

I'm a bit concerned about the subtle difference in behaviour... in the past, metadata schemes that needed multiple attachments added a level of indirection (point at a tuple of metadata). Will that work for the heap profiler?

because it is an inconsistency in IR support
(the LLVM langref does not specify anything with regards to this
behavior).

While the current behaviour (one attachment per kind) is not documented in LangRef, it's documented in the header docs, and I do think it was intentional; and in any case, it has been around for a long time and there could be code patterns relying on it.

Since this changes IR semantics, I suggest adding a change to LangRef and a discussion on llvm-dev (and/or the new proposals system).

A few design points that I'm not sure about:

  • What should getMetadata do if there are multiple attachments for the requested kind? (assert? return nullptr? return the first, last, or arbitrary?)
  • Should there be a way for metadata kinds to explicitly opt-in (or out) of allowing multiple attachments? (Maybe could be stored on the LLVMContext somehow?)
    • If not, what do we do about the inconsistency with !dbg? Support multiple attachments somehow?
  • (And: why is this better than using a level of indirection when multiple attachments are needed?)

I also have a bunch of comments inline assuming this goes forward.

llvm/include/llvm/IR/Instruction.h
287–293

What happens if there are multiple? Assert, return first, return arbitrary, or return nullptr?

329

Might be good to mention that if KindID already exists, this breaks future calls to getMetadata().

llvm/lib/IR/Metadata.cpp
1358–1361

Looks like !dbg doesn't support multiple instances. Should that be documented? Should this assert instead of silently overriding the existing one? Should the LLParser catch this as an error, or potentially use setMetadata() still for !dbg?

llvm/lib/Transforms/Utils/ValueMapper.cpp
924–926
  • I think this drops handling of mapMetadata() returning nullptr (which can happen if the ValueMapper client has mapped some metadata node explicitly). Is that intentional?
  • I wonder if there should be a setAllMetadata() API matching getAllMetadata() that combines the clear and the add loop (but without all the separate densemap lookups). This callsite would still need a loop in the middle to call mapMetadata().
llvm/test/Bitcode/branch-weight.ll
19

Seems a shame to modify a binary file. Could you instead add a separate test that round-trips to bitcode?

20–23

I think it'd be good for this to use string substitution blocks rather than relying on stable numbering: https://www.llvm.org/docs/CommandGuide/FileCheck.html#filecheck-string-substitution-blocks

I'm a bit concerned about the subtle difference in behaviour... in the past, metadata schemes that needed multiple attachments added a level of indirection (point at a tuple of metadata). Will that work for the heap profiler?

because it is an inconsistency in IR support
(the LLVM langref does not specify anything with regards to this
behavior).

While the current behaviour (one attachment per kind) is not documented in LangRef, it's documented in the header docs, and I do think it was intentional; and in any case, it has been around for a long time and there could be code patterns relying on it.

I think Teresa referred to the inconsistency between Global Object and Instruction. Is that intentional?

Since this changes IR semantics, I suggest adding a change to LangRef and a discussion on llvm-dev (and/or the new proposals system).

A few design points that I'm not sure about:

  • What should getMetadata do if there are multiple attachments for the requested kind? (assert? return nullptr? return the first, last, or arbitrary?)
  • Should there be a way for metadata kinds to explicitly opt-in (or out) of allowing multiple attachments? (Maybe could be stored on the LLVMContext somehow?)
    • If not, what do we do about the inconsistency with !dbg? Support multiple attachments somehow?
  • (And: why is this better than using a level of indirection when multiple attachments are needed?)

I also have a bunch of comments inline assuming this goes forward.

I think Teresa referred to the inconsistency between Global Object and Instruction. Is that intentional?

Thanks for clarifying!

I did some archival to confirm: yes, this was intentional. The change for globals landed in: https://reviews.llvm.org/D20414

IR: Allow multiple global metadata attachments with the same type.

(although it missed documenting the difference in LangRef, certainly it was specifically changing globals without changing instructions)

I disagree with the assertion in the commit message:

This will be necessary to allow the global merge pass to attach
multiple debug info metadata nodes to global variables once we reverse
the edge from DIGlobalVariable to GlobalVariable.

(since it wasn't necessary... multiple attachments could have been handled by adding indirection, as is typical for instruction metadata)
but the context/impact was different since the commit landed around the same time that global metadata attachments were introduced (https://reviews.llvm.org/D20074). This wasn't a subtle behaviour change that could affect existing code handling metadata attachments for global variables, since there was no such code (yet).

(The inconsistency between instructions / globals is certainly unfortunate... just want to make sure that if we add this feature for instructions the impact is carefully considered...)

Hi @dexonsmith, thanks for your comments! Responses below.

I think Teresa referred to the inconsistency between Global Object and Instruction. Is that intentional?

Thanks for clarifying!

I did some archival to confirm: yes, this was intentional. The change for globals landed in: https://reviews.llvm.org/D20414

Right, my patch description points to that patch. =)

IR: Allow multiple global metadata attachments with the same type.

(although it missed documenting the difference in LangRef, certainly it was specifically changing globals without changing instructions)

It was intentional with that change, but AFAICT that was based on need at the time and at least in the patch there is no pointer to a broader discussion on llvm-dev about changing the langref or as to why this fundamentally should only be allowed for global objects and not instructions. I am not sure why there should be a difference in functionality?

Note also that while this was initially added for debug info messages on variables, the support is now used for other types of global variable metadata attachments. For example !type metadata - vtables have one per type.

I disagree with the assertion in the commit message:

This will be necessary to allow the global merge pass to attach
multiple debug info metadata nodes to global variables once we reverse
the edge from DIGlobalVariable to GlobalVariable.

(since it wasn't necessary... multiple attachments could have been handled by adding indirection, as is typical for instruction metadata)
but the context/impact was different since the commit landed around the same time that global metadata attachments were introduced (https://reviews.llvm.org/D20074). This wasn't a subtle behaviour change that could affect existing code handling metadata attachments for global variables, since there was no such code (yet).

I'm not sure why this is more of a subtle behavior change for instructions than it was for global objects. The main difference for instructions would be the DbgLoc attachments, since those are handled differently, but I think that could be addressed through some assertions as you note in your comments below, and possibly improved over time if needed.

(The inconsistency between instructions / globals is certainly unfortunate... just want to make sure that if we add this feature for instructions the impact is carefully considered...)

I'm a bit concerned about the subtle difference in behaviour... in the past, metadata schemes that needed multiple attachments added a level of indirection (point at a tuple of metadata). Will that work for the heap profiler?

As noted above, global variables use multiple attachments for !type metadata, in addition to the original need for !dbg attachments.

For profile data, since currently all profile data uses a !prof attachment, but different MDString "tags", it seems cleaner IMO to use a different !prof attachment for different types of profile data. For instrumentation based PGO and PGHO (profile guided heap optimization), for now I think we would have the PGO "branch_weights" metadata and the PGHO TBD metadata on different types of instructions (branches vs calls), but if we combine SamplePGO with PGHO we will have both types on the same type of instructions (SamplePGO puts "branch_weight" metadata on calls). While we technically could support by changing all !prof to have a level of indirection, it seems cleaner IMO to allow different !prof attachments for the different types of profile data.

because it is an inconsistency in IR support
(the LLVM langref does not specify anything with regards to this
behavior).

While the current behaviour (one attachment per kind) is not documented in LangRef, it's documented in the header docs, and I do think it was intentional; and in any case, it has been around for a long time and there could be code patterns relying on it.

Since this changes IR semantics, I suggest adding a change to LangRef and a discussion on llvm-dev (and/or the new proposals system).

As noted above, this actually fixes a discrepancy between GlobalObject and Instruction handling. I am not sure why that discrepancy is desirable, and the LangRef doesn't mention either way and doesn't seem to distinguish between attachments to global objects vs instructions. If it is documented in the header docs beyond the places I changed please let me know and I can update that in this patch too.

A few design points that I'm not sure about:

  • What should getMetadata do if there are multiple attachments for the requested kind? (assert? return nullptr? return the first, last, or arbitrary?)

The getMetadata here relies on the underlying Value::getMetadata support. I copied the comments from there for the most part (at least for the header interfaces). Looking back, the original implementation in D20414 added an assert if there was more than one attachment of the requested type (for the getMetadata returing a single attachment). Sometime between that patch and when this was refactored into Value::getMetadata it appears that support disappeared. It looks like now the first is blindly returned. This should probably be fixed in the underlying Value::getMetadata call. I'll need to go back and look at the changes in between to see where/why this changed.

  • Should there be a way for metadata kinds to explicitly opt-in (or out) of allowing multiple attachments? (Maybe could be stored on the LLVMContext somehow?)

Not sure - should that be done for GlobalObjects as well? It isn't the case right now.

  • If not, what do we do about the inconsistency with !dbg? Support multiple attachments somehow?

That is a good question. I am not sure of the history of why !dbg on instructions is handled differently. Should that be changed to just use the same underlying Value metadata support?

  • (And: why is this better than using a level of indirection when multiple attachments are needed?)

I described some rationale above as to why I think it is cleaner, especially for something generic like !prof that specifies the profile data in the MDString tag first operand. And also for consistency with global objects, where as noted the support for multiple attachments of the same type has now been used for additional metadata attachment kinds.

I also have a bunch of comments inline assuming this goes forward.

llvm/include/llvm/IR/Instruction.h
287–293

I addressed this in my comments above too, but to reiterate, this relies on the behavior of the underlying Value::getMetadata support, and that support just seems to return the first matching one, which is a regression in the behavior from the original support added in D20414, which asserted if there was more than one. I need to go back and see where/why that assert was removed before it was refactored into the Value class in D67626. Since Instruction will use the underlying Value support, it should be fixed there.

329

Will do.

llvm/lib/IR/Metadata.cpp
1358–1361

Yeah I responded to this above too - do you know the history or rationale for having separate support for !dbg attachments on instructions? It should be documented somewhere, or fixed. The assert is a good idea, will add. And yes a hard error in LLParser is a good idea as well.

llvm/lib/Transforms/Utils/ValueMapper.cpp
924–926

Ah the change to cast from cast_or_null was unintentional. I copied this from the code elsewhere in this file that does the exact same thing for GlobalObject md attachments.

The setAllMetadata interface sounds good, I would need to add in Value as well and can use in this file for GlobalObjects.

llvm/test/Bitcode/branch-weight.ll
19

Sure. Yeah, not sure why this was done as a binary file in the first place. I can llvm-as it fine with --disable-verify, which is how I generated the new binary file.

20–23

Will do.

tejohnson added inline comments.Jul 1 2021, 11:34 AM
llvm/include/llvm/IR/Instruction.h
287–293

It turns out the assert was removed in D38184. Looks like this was to move handling of malformed debug info to AutoUpgrade. Per patch:
"Prior to this patch, a NoAsserts build would have printed a
warning and stripped the debug info and an Asserts build would have
asserted, after this patch the Verifier will report a fatal error."

I'm assuming the assert it refers to is the one removed in Metadata.cpp in that patch. Perhaps the assert should be left it but restricted to non-dbg metadata?

I disagree with the assertion in the commit message:

This will be necessary to allow the global merge pass to attach
multiple debug info metadata nodes to global variables once we reverse
the edge from DIGlobalVariable to GlobalVariable.

(since it wasn't necessary... multiple attachments could have been handled by adding indirection, as is typical for instruction metadata)
but the context/impact was different since the commit landed around the same time that global metadata attachments were introduced (https://reviews.llvm.org/D20074). This wasn't a subtle behaviour change that could affect existing code handling metadata attachments for global variables, since there was no such code (yet).

I'm not sure why this is more of a subtle behavior change for instructions than it was for global objects. The main difference for instructions would be the DbgLoc attachments, since those are handled differently, but I think that could be addressed through some assertions as you note in your comments below, and possibly improved over time if needed.

Here's why I think it's a subtle change for instructions. If getMetadata() previously couldn't fail (returned nullptr or value), but now can assert or return a non-deterministic value, I fear it would make existing working code (say in optimizations that look at loop metadata) in subtle ways... but the bug would be latent, waiting for other code to start adding multiple attachments of the same kind.

For global objects, it wasn't a behaviour change at all. Global objects have used a multimap all along (give or take a single day). There's no existing code that suddenly became buggy. There was no change in the API, so it cannot have been subtle.

As noted above, this actually fixes a discrepancy between GlobalObject and Instruction handling. I am not sure why that discrepancy is desirable, and the LangRef doesn't mention either way and doesn't seem to distinguish between attachments to global objects vs instructions.

I remain convinced that this is a change to LLVM IR semantics. Because of that, I assert that it should be reviewed by a wider audience.

(IMO, unifying behaviour here is a good idea, but I strongly disagree with framing that suggests it's fixing an accidental quirk, and I think it needs to be done carefully. It seems to be a changing long-standing (and intended) behaviour.)

A few design points that I'm not sure about:

  • What should getMetadata do if there are multiple attachments for the requested kind? (assert? return nullptr? return the first, last, or arbitrary?)

The getMetadata here relies on the underlying Value::getMetadata support.

The existing support is for global objects. They started with an assertion that was then relaxed to something else.

The header docs currently says "fails", which sounds to me like a crash, which sounds like a scary and subtle behaviour change.

IMO, the best choices would be "first" or "last"; I think either one would avoid creating the sorts of subtle behaviour changes in existing code that I'm concerned about. Between them:

  • "first" seems more natural / intuitive to me, not sure why.
  • "last" would mean getMetadata() gives the same answer for textual IR that already listed two attachments of the same kind (previously, the last one would overwrite the first). Maybe it'd be nice to keep textual IR slightly more stable like this, not sure.

WDYT? (Seems like something that could be discussed on LLVM dev though...)

  • Should there be a way for metadata kinds to explicitly opt-in (or out) of allowing multiple attachments? (Maybe could be stored on the LLVMContext somehow?)

Not sure - should that be done for GlobalObjects as well? It isn't the case right now.

I think !dbg has a verifier check that guarantees only one attachment per global object. Probably makes sense to add verifier checks for most existing kinds for instructions (at least !dbg, maybe also tbaa, loop, maybe most others?). One option is to add the verifier check for all kinds, and document in LangRef that instructions by default only support one attachment per kind. Then when !prof code is updated to correctly handle multiple kinds, it can opt out of the verifier check (and update LangRef to say it supports multiple). (@aprantl, any thoughts?)

  • If not, what do we do about the inconsistency with !dbg? Support multiple attachments somehow?

That is a good question. I am not sure of the history of why !dbg on instructions is handled differently. Should that be changed to just use the same underlying Value metadata support?

Yeah, maybe it'd be better to unify !dbg as a prep patch, before landing this. Note that it's an important compile-time optimization, whose history I've explained inline below (I also have a concrete idea for how to unify it without losing the optimization).

  • (And: why is this better than using a level of indirection when multiple attachments are needed?)

I described some rationale above as to why I think it is cleaner, especially for something generic like !prof that specifies the profile data in the MDString tag first operand. And also for consistency with global objects, where as noted the support for multiple attachments of the same type has now been used for additional metadata attachment kinds.

SGTM, I suggest including that in the post proposing the change to LLVM IR.

llvm/lib/IR/Metadata.cpp
1358–1361

I know a bit about the history.

When debug info is on, it puts metadata on every instruction. The booking for metadata makes this very expensive, creating an attachment map for each instruction. Storing it directly on the instruction is cheaper.

This design dates from when Metadata inherited from Value and needed a TrackingVH<MDNode> here (which is bigger than a pointer I think?). DebugLoc stored 64-bits of data that could be used to look up the actual MDNode in the LLVMContext.

I bet there are other good options now for getting this storage optimization.

For example, something like this might be simpler and a bit more general:


class Instruction  /* or whatever Value wants this optimization */ {
  // ...


  /// Metadata attachments for `Instruction`.
  ///
  /// If the instruction has a single attachment and its MDKind<4 (fits
  /// in 2 bits), this points directly at the attachment and the int
  /// stores the kind. Otherwise (more than 1 attachment, or MDKind>=4),
  /// this points at the metadata table. Similar to a SmallPtrSet.
  PointerIntPair<PointerUnion<MDNode *, MDAttachments *>, 2> Attachments;
};

(And the MDAttachments could be skipped from the LLVMContextImpl, where it currently is.)

The same optimization probably makes sense for GlobalObjects - when debug info is on, almost all of them have a !dbg attachment.

Anyway, the reason I suggest this alternative storage scheme is that it gives us the !dbg optimization without having to treat !dbg specially. As a bonus, !prof would benefit from the same optimization.

(One subtle complication is that during IR construction, metadata might be "unresolved". When a single MDNode assigned to Attachments is "unresolved", we'd need to somehow hook this into the TrackingMDRef side-table for Metadata's usually-off-but-sometimes-on RAUW support.)

I disagree with the assertion in the commit message:

This will be necessary to allow the global merge pass to attach
multiple debug info metadata nodes to global variables once we reverse
the edge from DIGlobalVariable to GlobalVariable.

(since it wasn't necessary... multiple attachments could have been handled by adding indirection, as is typical for instruction metadata)
but the context/impact was different since the commit landed around the same time that global metadata attachments were introduced (https://reviews.llvm.org/D20074). This wasn't a subtle behaviour change that could affect existing code handling metadata attachments for global variables, since there was no such code (yet).

I'm not sure why this is more of a subtle behavior change for instructions than it was for global objects. The main difference for instructions would be the DbgLoc attachments, since those are handled differently, but I think that could be addressed through some assertions as you note in your comments below, and possibly improved over time if needed.

Here's why I think it's a subtle change for instructions. If getMetadata() previously couldn't fail (returned nullptr or value), but now can assert or return a non-deterministic value, I fear it would make existing working code (say in optimizations that look at loop metadata) in subtle ways... but the bug would be latent, waiting for other code to start adding multiple attachments of the same kind.

Can you clarify how this change would affect currently working code, since we don't currently support >1 attachment? The only thing I can think of is llvm assembly code that incorrectly has >1 metadata attachment on an instruction. In that case the compiler currently silently will keep the last one. As you note further below we could simulate that behavior for getMetadata instead of asserting by returning the last one. We could also add in the verification check you suggested to add an allowlist of attachment types that can have >1 attachment and clean up any tests that fire, since they presumably are working accidentally.

For global objects, it wasn't a behaviour change at all. Global objects have used a multimap all along (give or take a single day). There's no existing code that suddenly became buggy. There was no change in the API, so it cannot have been subtle.

While they may have supported >1 attachment internally, due to the lack of a getMetadata with multiple MDs returned or addMetadata before D20414 I'm not sure how it would have in practice supported >1 attachment of the same type?

As noted above, this actually fixes a discrepancy between GlobalObject and Instruction handling. I am not sure why that discrepancy is desirable, and the LangRef doesn't mention either way and doesn't seem to distinguish between attachments to global objects vs instructions.

I remain convinced that this is a change to LLVM IR semantics. Because of that, I assert that it should be reviewed by a wider audience.

I can send out an RFC then.

(IMO, unifying behaviour here is a good idea, but I strongly disagree with framing that suggests it's fixing an accidental quirk, and I think it needs to be done carefully. It seems to be a changing long-standing (and intended) behaviour.)

A few design points that I'm not sure about:

  • What should getMetadata do if there are multiple attachments for the requested kind? (assert? return nullptr? return the first, last, or arbitrary?)

The getMetadata here relies on the underlying Value::getMetadata support.

The existing support is for global objects. They started with an assertion that was then relaxed to something else.

The header docs currently says "fails", which sounds to me like a crash, which sounds like a scary and subtle behaviour change.

IMO, the best choices would be "first" or "last"; I think either one would avoid creating the sorts of subtle behaviour changes in existing code that I'm concerned about. Between them:

  • "first" seems more natural / intuitive to me, not sure why.
  • "last" would mean getMetadata() gives the same answer for textual IR that already listed two attachments of the same kind (previously, the last one would overwrite the first). Maybe it'd be nice to keep textual IR slightly more stable like this, not sure.

WDYT? (Seems like something that could be discussed on LLVM dev though...)

Ok. Right, the LLParser will silently drop all but the last if you try to use multiple on an instruction with the same type. But the Value implementation returns the first afaict.

  • Should there be a way for metadata kinds to explicitly opt-in (or out) of allowing multiple attachments? (Maybe could be stored on the LLVMContext somehow?)

Not sure - should that be done for GlobalObjects as well? It isn't the case right now.

I think !dbg has a verifier check that guarantees only one attachment per global object. Probably makes sense to add verifier checks for most existing kinds for instructions (at least !dbg, maybe also tbaa, loop, maybe most others?). One option is to add the verifier check for all kinds, and document in LangRef that instructions by default only support one attachment per kind. Then when !prof code is updated to correctly handle multiple kinds, it can opt out of the verifier check (and update LangRef to say it supports multiple). (@aprantl, any thoughts?)

  • If not, what do we do about the inconsistency with !dbg? Support multiple attachments somehow?

That is a good question. I am not sure of the history of why !dbg on instructions is handled differently. Should that be changed to just use the same underlying Value metadata support?

Yeah, maybe it'd be better to unify !dbg as a prep patch, before landing this. Note that it's an important compile-time optimization, whose history I've explained inline below (I also have a concrete idea for how to unify it without losing the optimization).

Thanks for the background and the suggestions below. I think changing that should be orthogonal to adding this general support to instructions though, especially if we go with the allowlist so !dbg only allows one. Or is there another reason we would need to change that first?

  • (And: why is this better than using a level of indirection when multiple attachments are needed?)

I described some rationale above as to why I think it is cleaner, especially for something generic like !prof that specifies the profile data in the MDString tag first operand. And also for consistency with global objects, where as noted the support for multiple attachments of the same type has now been used for additional metadata attachment kinds.

SGTM, I suggest including that in the post proposing the change to LLVM IR.

Ack.

Can you clarify how this change would affect currently working code, since we don't currently support >1 attachment?

With the right design, it probably won't... and we can get the right verifier checks in place to ensure newly-expressible-but-invalid IR doesn't slip through. (Note that the design implied by the header docs, where getMetadata() "fails", an existing IR consumer that calls getMetadata() has a latent bug since it's going to start crashing if/when an IR producer starts adding multiple attachments. The bug would not be exposed immediately, only once some producer starts using the feature.)

A couple more points I just thought of for !prof (which is the kind that would support multiple attachments):

  • We probably need verifier that there aren't two competing (e.g.) !"branch_weights" entries on the same instruction
  • We probably need an API for "setting" the !"branch_weights" entry (since blindly calling addMetadata() would make the above verifier fail)

Also: I thought of another alternative approach to the problem you're trying to solve with !prof: move some of the tags to a different/new MDKind. (And if you're introducing a new tag for this feature, you don't even have to update testcases or do a bitcode upgrade, just add another kind to the enum...)

Ok. Right, the LLParser will silently drop all but the last if you try to use multiple on an instruction with the same type. But the Value implementation returns the first afaict.

IMO the first is a better answer; if no one brings up concrete concerns with that it seems fine to me.

Thanks for the background and the suggestions below. I think changing that should be orthogonal to adding this general support to instructions though, especially if we go with the allowlist so !dbg only allows one. Or is there another reason we would need to change that first?

Probably not a hard requirement; just seemed cleaner to me to remove the special case for !dbg than to add more of them.

Can you clarify how this change would affect currently working code, since we don't currently support >1 attachment?

With the right design, it probably won't... and we can get the right verifier checks in place to ensure newly-expressible-but-invalid IR doesn't slip through. (Note that the design implied by the header docs, where getMetadata() "fails", an existing IR consumer that calls getMetadata() has a latent bug since it's going to start crashing if/when an IR producer starts adding multiple attachments. The bug would not be exposed immediately, only once some producer starts using the feature.)

A couple more points I just thought of for !prof (which is the kind that would support multiple attachments):

  • We probably need verifier that there aren't two competing (e.g.) !"branch_weights" entries on the same instruction

Ditto for value profile !prof metadata ("VP" tag).

  • We probably need an API for "setting" the !"branch_weights" entry (since blindly calling addMetadata() would make the above verifier fail)

Not sure I follow - right now there shouldn't be multiple branch weights prof metadata and I don't see a need for that going forward (the heap profiler data would use a different/new tag for its !prof metadata).

Also: I thought of another alternative approach to the problem you're trying to solve with !prof: move some of the tags to a different/new MDKind. (And if you're introducing a new tag for this feature, you don't even have to update testcases or do a bitcode upgrade, just add another kind to the enum...)

Do you mean add something new like !heapprof? I thought about that, but it seemed cleaner to put all profile data under !prof. But it would solve part of the issue (needing multiple !prof due to other types of profile metadata on the inst), and avoid the need for this change if we also use a level of indirection when there is more than one heap profile per allocation.

Ok. Right, the LLParser will silently drop all but the last if you try to use multiple on an instruction with the same type. But the Value implementation returns the first afaict.

IMO the first is a better answer; if no one brings up concrete concerns with that it seems fine to me.

Thanks for the background and the suggestions below. I think changing that should be orthogonal to adding this general support to instructions though, especially if we go with the allowlist so !dbg only allows one. Or is there another reason we would need to change that first?

Probably not a hard requirement; just seemed cleaner to me to remove the special case for !dbg than to add more of them.

  • We probably need an API for "setting" the !"branch_weights" entry (since blindly calling addMetadata() would make the above verifier fail)

Not sure I follow - right now there shouldn't be multiple branch weights prof metadata and I don't see a need for that going forward (the heap profiler data would use a different/new tag for its !prof metadata).

Agreed.

Were you intending to substitute all existing calls to setMetadata() with calls to addMetadata()? I wrote the above bullet after noticing that doing so would be incorrect when updating metadata (e.g., fixing up branch weights after a transformation). Instead of replacing the tag you'll end up with two of them (rendering the IR invalid). My suggestion is to add a helper for the logic needed to update metadata safely.

Walking through my logic in more detail:

This patch (the one we're looking at) will give Instruction the capability of having multiple attachments of the same kind. It'll have a verifier check that disallows it in practice (since it'll initially be invalid IR for all metadata kinds), but then specific metadata kinds can opt-in as needed (by opting out of the verifier check, and adding new more specific checks as is useful).

A future patch will leverage this feature for !prof attachments. The suggested new rules for the !prof attachment treat it as a set, where nodes are uniqued by their first operand. But it's not a multiset; only one node is allowed for each identity tag. (Alternatively, you could see this as a map, where the data is the node and the key is its first operand.)

In some cases, code will have a priori knowledge that an instruction does not yet have a given tag attached. In those cases, it'll be safe to call addMetadata() since it won't invalidate the set.

In other cases, code will need logic something like this for the replace operation:

auto replaceProfAttachment = [](const Instruction &I, MDNode &NewN) {
  SmallVector<MDNode *> MDs;
  I.getMetadata(MD_prof, MDs);
  for (int N = 0, NE = MDs.size(); N != NE; ++N) {
    if (MDs[N]->getOperand(0) != NewN.getOperand(0))
      continue;

    // Found a node with the same tag to replace.
    MDs[N] = &NewN;
    I.setMetadata(MD_prof, MDs);
    return;
  }
  // Tag not found. Safe to append.
  I.addMetadata(MD_prof, &NewN);
};

Maybe an API like this could be generalized?

I.addOrReplaceMetadataWithTaggedOperand(MD_prof, NewN, /*TaggedOperandNo=*/0);

Scanning through the code, many existing calls to setMetadata() with MD_prof are on new instructions (the former case where addMetadata is safe). But there are also a few that look like updates to me (the latter case where it isn't).

I also noticed calls to setMetadata(MD_prof, nullptr) -- not sure if you'd want those updated to delete just the branch weights (which would need a set-erase operation), or if it always would make sense to clear all !prof attachments together.

Do you mean add something new like !heapprof? I thought about that, but it seemed cleaner to put all profile data under !prof. But it would solve part of the issue (needing multiple !prof due to other types of profile metadata on the inst), and avoid the need for this change if we also use a level of indirection when there is more than one heap profile per allocation.

Yes, exactly.

If you end up going that way (and abandoning this patch), probably one of us should update LangRef to better document the existing behaviour (maybe the commit message can reference this review for a few design points to consider if someone else needs the feature).

Also in that case, I'm curious if you think it'd make sense to split out !valueprof at some point to make it orthogonal from branch weights? (Or maybe it doesn't matter, since they'll never/rarely be on the same instructions?)

  • We probably need an API for "setting" the !"branch_weights" entry (since blindly calling addMetadata() would make the above verifier fail)

Not sure I follow - right now there shouldn't be multiple branch weights prof metadata and I don't see a need for that going forward (the heap profiler data would use a different/new tag for its !prof metadata).

Agreed.

Were you intending to substitute all existing calls to setMetadata() with calls to addMetadata()? I wrote the above bullet after noticing that doing so would be incorrect when updating metadata (e.g., fixing up branch weights after a transformation). Instead of replacing the tag you'll end up with two of them (rendering the IR invalid). My suggestion is to add a helper for the logic needed to update metadata safely.

No I initially didn't intend to update these to addMetadata, but that is a hole that will be exposed once we use heap profile in combination with sample PGO where we may end up with both the new heap prof metadata and a branch_weights metadata on a call, so that is a good point that we need to think through that now if we go with the approach in this patch.

Walking through my logic in more detail:

This patch (the one we're looking at) will give Instruction the capability of having multiple attachments of the same kind. It'll have a verifier check that disallows it in practice (since it'll initially be invalid IR for all metadata kinds), but then specific metadata kinds can opt-in as needed (by opting out of the verifier check, and adding new more specific checks as is useful).

A future patch will leverage this feature for !prof attachments. The suggested new rules for the !prof attachment treat it as a set, where nodes are uniqued by their first operand. But it's not a multiset; only one node is allowed for each identity tag. (Alternatively, you could see this as a map, where the data is the node and the key is its first operand.)

There are 2 motivations for this patch from my side:

  1. My primary motivation: allow multiple new heap prof metadatas on a single allocation call without indirection, via multiple !prof attachments each with the same tag. E.g.
call malloc !prof !0, !prof !1

!0 = !{"heap", [stack trace 1 heap profile data]}
!1 = !{"heap", [stack trace 2 heap profile data]}

This could alternatively be addressed via indirection, possibly with an optimization to inline it if there is only one. I.e.

call malloc !prof !0

; if a single stack trace
!0 = !{"heap", [stack trace 1 heap profile data]} 

; if >1 stack traces
!0 = !{"heap", !{1, !2}}
!1 = !{[stack trace 1 heap profile data]}
!2 = !{[stack trace 2 heap profile data]}
  1. When we combine heap profile data with sample PGO we may end up with both heap prof and branch_weights prof metadata on a single call. This could alternatively be addressed by using a different md kind altogether, e.g. !heapprof.

What you describe below will not work if we want to attach multiple "heap" !prof attachments. But I'm leaning towards the alternate described above where we use indirection if there is >1 which would simplify things. Otherwise we need additional special casing which is not ideal.

In some cases, code will have a priori knowledge that an instruction does not yet have a given tag attached. In those cases, it'll be safe to call addMetadata() since it won't invalidate the set.

In other cases, code will need logic something like this for the replace operation:

auto replaceProfAttachment = [](const Instruction &I, MDNode &NewN) {
  SmallVector<MDNode *> MDs;
  I.getMetadata(MD_prof, MDs);
  for (int N = 0, NE = MDs.size(); N != NE; ++N) {
    if (MDs[N]->getOperand(0) != NewN.getOperand(0))
      continue;

    // Found a node with the same tag to replace.
    MDs[N] = &NewN;
    I.setMetadata(MD_prof, MDs);
    return;
  }
  // Tag not found. Safe to append.
  I.addMetadata(MD_prof, &NewN);
};

Maybe an API like this could be generalized?

I.addOrReplaceMetadataWithTaggedOperand(MD_prof, NewN, /*TaggedOperandNo=*/0);

Scanning through the code, many existing calls to setMetadata() with MD_prof are on new instructions (the former case where addMetadata is safe). But there are also a few that look like updates to me (the latter case where it isn't).

I also noticed calls to setMetadata(MD_prof, nullptr) -- not sure if you'd want those updated to delete just the branch weights (which would need a set-erase operation), or if it always would make sense to clear all !prof attachments together.

These are very good questions and thanks for the detailed suggestions above. I initially was ignoring these calls, since I was focused on motivation #1 and not thinking as much about the case where we would be combining this with sample PGO (since that is a longer term goal) and have !prof with different tags on the same instruction, but it is better to address these now if we go with allowing multiple !prof attachments with different tags to avoid surprises later. These existing calls to setMetadata to overwrite or clear the branch_weights md do need to be updated along the lines of what you suggest above (i.e. with a new interface to do it by tag) to address that future need.

Do you mean add something new like !heapprof? I thought about that, but it seemed cleaner to put all profile data under !prof. But it would solve part of the issue (needing multiple !prof due to other types of profile metadata on the inst), and avoid the need for this change if we also use a level of indirection when there is more than one heap profile per allocation.

Yes, exactly.

If you end up going that way (and abandoning this patch), probably one of us should update LangRef to better document the existing behaviour (maybe the commit message can reference this review for a few design points to consider if someone else needs the feature).

Agreed. Let me discuss this with my collaborators and think about it a little more. But yes, in either case let's plan to update LangRef. I was surprised that it didn't mention anything in this regard. Will circle back probably after the holiday.

Also in that case, I'm curious if you think it'd make sense to split out !valueprof at some point to make it orthogonal from branch weights? (Or maybe it doesn't matter, since they'll never/rarely be on the same instructions?)

Currently they are on different instructions than branch_weight metadata (otherwise I guess we would have hit this issue previously). I'm not sure if that will ever change, but if it does then I guess we could migrate to the same solution we end up going to for heap profiling.

  1. My primary motivation: allow multiple new heap prof metadatas on a single allocation call without indirection, via multiple !prof attachments each with the same tag. E.g.
call malloc !prof !0, !prof !1

!0 = !{"heap", [stack trace 1 heap profile data]}
!1 = !{"heap", [stack trace 2 heap profile data]}

Ah, I'd completely missed this, I was just thinking about the orthogonal metadata case.

Seems like a separate !heapprof attachment that uses the feature from this patch (leaving !prof alone, limited by the verifier to 1-per-instruction) could be pretty clean, but as you say, there are a number of reasonable options.

Agreed. Let me discuss this with my collaborators and think about it a little more. But yes, in either case let's plan to update LangRef. I was surprised that it didn't mention anything in this regard. Will circle back probably after the holiday.

Sounds great!

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
5526–5528

Just noticed this: if this patch does move forward, I'd suggest deferring this change for a later patch and making this one just about the IR feature. Maybe it could be tested with a kind that doesn't have a fixed id, like !attachment?

Allowing multiple md_prof in general and the set, update vs add problem @dexonsmith pointed out reminds me of the bug D69736 was fixing. So I can see that concern too.

Not trying to muddy the discussion as the comments below doesn't directly help the situation with allowing multiple md_prof. But ...

  1. I'm wondering if we should consider design the profile metadata in a way that context info can be (optionally) extended to all profile info too. E.g. branch_weights can have optional context info too, and it's backward compatible in the sense that omitting context info naturally falls back to today's context-less profile. And heap profile just uses the same metadata representation/structure except that it always have the optional context info filled. Saying that because with CSSPGO, we do have context for all branch weights stuff, though currently all such context will be lost if corresponding inlining doesn't happen when loading the profile. If we do that, we'd need to make sure there's no overhead when the optional context isn't present though.
  1. Without the general profile context support above (which may not worth the complexity if the overhead can't be zero), the heap profile metadata feels closer to "VP" rather than "branch_weights" - "VP" tag consists of a list of value/count pair, which could be generalize into key-value pairs that covers both today's VP and heap profile (key being context, and value being the heap profile). This is related to the comment on "split out !valueprof at some point to make it orthogonal from branch weights"..
  1. My primary motivation: allow multiple new heap prof metadatas on a single allocation call without indirection, via multiple !prof attachments each with the same tag. E.g.
call malloc !prof !0, !prof !1

!0 = !{"heap", [stack trace 1 heap profile data]}
!1 = !{"heap", [stack trace 2 heap profile data]}

Ah, I'd completely missed this, I was just thinking about the orthogonal metadata case.

Seems like a separate !heapprof attachment that uses the feature from this patch (leaving !prof alone, limited by the verifier to 1-per-instruction) could be pretty clean, but as you say, there are a number of reasonable options.

Agreed. Let me discuss this with my collaborators and think about it a little more. But yes, in either case let's plan to update LangRef. I was surprised that it didn't mention anything in this regard. Will circle back probably after the holiday.

Sounds great!

I think we will go in the direction of using new metadata attachment kinds, with indirection, to avoid this issue completely. I can write up and send you a langref patch, unless you think this warrants an llvm-dev discussion first?

Allowing multiple md_prof in general and the set, update vs add problem @dexonsmith pointed out reminds me of the bug D69736 was fixing. So I can see that concern too.

Not trying to muddy the discussion as the comments below doesn't directly help the situation with allowing multiple md_prof. But ...

  1. I'm wondering if we should consider design the profile metadata in a way that context info can be (optionally) extended to all profile info too. E.g. branch_weights can have optional context info too, and it's backward compatible in the sense that omitting context info naturally falls back to today's context-less profile. And heap profile just uses the same metadata representation/structure except that it always have the optional context info filled. Saying that because with CSSPGO, we do have context for all branch weights stuff, though currently all such context will be lost if corresponding inlining doesn't happen when loading the profile. If we do that, we'd need to make sure there's no overhead when the optional context isn't present though.

This is a very interesting idea. We can consider using a metadata format for the context portion of the profile that can be eventually shared with other types of profile attachments. BTW we will be sending an RFC soon-ish for the binary profile and metadata formats.

  1. Without the general profile context support above (which may not worth the complexity if the overhead can't be zero), the heap profile metadata feels closer to "VP" rather than "branch_weights" - "VP" tag consists of a list of value/count pair, which could be generalize into key-value pairs that covers both today's VP and heap profile (key being context, and value being the heap profile). This is related to the comment on "split out !valueprof at some point to make it orthogonal from branch weights"..

In some ways, although the VP profile is quite a bit simpler IMO (just a list of ints with hardcoded meaning, whereas the heap profile will have a variety of profiling statistics) and probably deserves to stay separate from the heap profiling. But yes, it is possible it should be separated from the branch weights.

I think we will go in the direction of using new metadata attachment kinds, with indirection, to avoid this issue completely. I can write up and send you a langref patch, unless you think this warrants an llvm-dev discussion first?

Since this is just clarifying the existing support/behaviour, I'm not sure it needs an llvm-dev discussion, but up to you.

I think we will go in the direction of using new metadata attachment kinds, with indirection, to avoid this issue completely. I can write up and send you a langref patch, unless you think this warrants an llvm-dev discussion first?

Since this is just clarifying the existing support/behaviour, I'm not sure it needs an llvm-dev discussion, but up to you.

Just mailed you D106304 for review to add this clarification to the langref.