This is an archive of the discontinued LLVM Phabricator instance.

[Metadata] Introduce MD_pcsections
ClosedPublic

Authored by melver on Aug 1 2022, 1:53 AM.

Details

Summary

Introduces MD_pcsections metadata kind. See added documentation for
more details.

Subsequent patches enable propagating PC sections metadata through code
generation to the AsmPrinter.

RFC: https://discourse.llvm.org/t/rfc-pc-keyed-metadata-at-runtime/64191

Diff Detail

Event Timeline

melver created this revision.Aug 1 2022, 1:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2022, 1:53 AM
melver requested review of this revision.Aug 1 2022, 1:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 1 2022, 1:53 AM
dvyukov accepted this revision.Aug 1 2022, 11:52 PM
dvyukov added inline comments.
llvm/docs/PCSectionsMetadata.rst
37–38

There seems to be something with wording:
"backend to emit the PC ... to be emitted to all named sections"
"backend to emit the PC ... to all named sections"?

This revision is now accepted and ready to land.Aug 1 2022, 11:52 PM
melver updated this revision to Diff 449217.Aug 2 2022, 1:29 AM
melver marked an inline comment as done.

Fix typos.

MaskRay added a subscriber: MaskRay.Aug 2 2022, 6:37 PM

This introduces new LangRef constructs. I suggest that you enlarge the reviewer list.

This link can be amended into the summary (commit message).

MaskRay added inline comments.Aug 2 2022, 8:10 PM
llvm/test/CodeGen/AArch64/pcsections.ll
1 ↗(On Diff #449217)

One RUN line for the default code model suffices. There is no codegen optimization difference.

A patch should not introduce disabled tests this way. If a dependent patch introduces the functionality, the test should be added to that patch.

MaskRay added inline comments.Aug 2 2022, 8:12 PM
llvm/docs/PCSectionsMetadata.rst
28

IIRC metadata is a legacy syntax (llvm 3.5 time). New syntax doesn't use metadata. The doc here should use the new syntax.

melver added a comment.Aug 3 2022, 8:57 AM

This introduces new LangRef constructs. I suggest that you enlarge the reviewer list.

This link can be amended into the summary (commit message).

I thought to wait for some comments on the RFC.
I'll add more reviewers after reshuffling the series.

melver marked an inline comment as done.Aug 3 2022, 9:02 AM
melver added inline comments.
llvm/test/CodeGen/AArch64/pcsections.ll
1 ↗(On Diff #449217)

I don't understand. Removing any one line here will reduce coverage.

We have RUNs for each -O0 to -O3 with the default code model, and then only O1 with a large code model.

melver marked an inline comment as done.Aug 3 2022, 9:07 AM
melver added inline comments.
llvm/test/CodeGen/AArch64/pcsections.ll
1 ↗(On Diff #449217)

Note that e.g. on arm64 -O0 uses GlobalISel and -O1 and above use FastISel+SelectionDAG, so using different optimization levels will test different ISels. Different optimization levels have helped me catch bugs in earlier implementations.

melver updated this revision to Diff 450022.Aug 4 2022, 9:55 AM
melver marked an inline comment as done.
melver edited the summary of this revision. (Show Details)

Address comments.

MaskRay added inline comments.Aug 4 2022, 10:42 AM
llvm/test/CodeGen/AArch64/pcsections.ll
1 ↗(On Diff #449217)

To exercise all of GlobalISel FastISel SelectionDAG, the best way is to use explicit options like -fast-isel=, -global-isel=, instead of relying on -O0/-O1/-O2/-O3. -O* runs a large set of codegen pipeline which is pretty irrelevant to this patch.

For a patch series, it's not suitable to add BROKEN: in patch a and change them to RUN: in patch b. This should be refactored to just add RUN: in patch b. If you feel that showding the diff is important, change a to add RUN: about the current state and change b to show the difference.

vitalybuka requested changes to this revision.Aug 4 2022, 10:50 AM
vitalybuka added inline comments.
llvm/lib/IR/MDBuilder.cpp
162

There is TEST_F(MDBuilderTest, which can be updated

166

is implicit mapping from Sections[i] to AuxData[i] in interface is neccecary?
can you do SomeMap<StringRef, <ArrayRef<Constant *>> ?

That would be cleaner

174

why do you need nullptr in that array?

This revision now requires changes to proceed.Aug 4 2022, 10:50 AM
melver marked 3 inline comments as done.Aug 4 2022, 2:20 PM
melver added inline comments.
llvm/lib/IR/MDBuilder.cpp
166

Map doesn't have stable iteration order. So I'm trying ArrayRef<std::pair<StringRef, SmallVector<Constant*>>>. The problem is that the inner array can't be an ArrayRef, otherwise users of the function have to either convert to something that can convert to ArrayRef<std::pair<StringRef, ArrayRef>> ensuring that whatever inner ref they pass lives long enough.

The reason of the earlier interface was to avoid forcing SmallVector or any particular vector onto callers by sticking to ArrayRef. But by wrapping it into a std::pair, the ArrayRef conversion can't be done implicitly anymore, so usability suffers.

This means that in SanitizerBinaryMetadadata, I'm now using a SmallVector<StringRef, SmallVector<Constant*>>> (instead of just a SmallVector<StringRef>) where the inner SmallVector will always be 0-sized, which wastes space but didn't before. Perhaps this was premature optimization.

174

I don't think we do.

llvm/test/CodeGen/AArch64/pcsections.ll
1 ↗(On Diff #449217)

In the latest diff of this patch the tests are gone and I moved them to the AsmPrinter patch (now later in the series).

I will additionally add tests to explicitly exercise each ISel there, but want to keep -O* for the additional coverage (whether or not there are irrelevant optimizations being run today, in future this may change).

vitalybuka added inline comments.Aug 4 2022, 8:14 PM
llvm/docs/PCSectionsMetadata.rst
30

can aux-consts be just an aggregate?

so you have more structured format: section, array, section, array, section, array,....

llvm/lib/IR/MDBuilder.cpp
166

Map doesn't have stable iteration order.

Do you need a particular section order?

pair<stringref, SmallVector> is not nice, but better than obvious mapping.

maybe real builder?

PCSectionsBuilder B =  MDB.getPCSectionsBuilder();
F.setMetadata(
   B.addSection(S, {c1})
      .addSection(S, {}).
      .addSection(S, {c1, c2})
      .buid());
vitalybuka added inline comments.Aug 4 2022, 8:17 PM
llvm/lib/IR/MDBuilder.cpp
166

Map doesn't have stable iteration order.

Do you need a particular section order?

pair<stringref, SmallVector> is not nice, but better than non-obvious mapping.

maybe real builder?

PCSectionsBuilder B =  MDB.getPCSectionsBuilder();
F.setMetadata(
   B.addSection(S, {c1})
      .addSection(S, {}).
      .addSection(S, {c1, c2})
      .build());
melver updated this revision to Diff 450296.Aug 5 2022, 8:17 AM
melver marked 7 inline comments as done.
  • Change MDNode format to take aux constants as another MDNode.
  • Change createPCSections interface.
  • Add MDBuilderTest test case.
melver added inline comments.Aug 5 2022, 8:24 AM
llvm/docs/PCSectionsMetadata.rst
30

Yes it can - it's a trade-off. Making it another MDTuple makes interpreting and constructing the MDNode more complex. But it might save space if the same auxiliary data is reused for different sections.

On a whole, placing the auxiliary data into a separate tuple seems nicer, so I've changed the format. However, I feel there's no point in artificially restricting only 1 tuple after a section string, so things like `!0 = !{"sec1", !1, !2, "sec2", ..}` will be allowed (distinguishing MDString and MDNode within the outer tuple is unambiguous).

llvm/lib/IR/MDBuilder.cpp
166

We need order for deterministic compilation and tests. :-/

The main downside of doing pair<StringRef, SmallVector> is a little bit of wasted space. A builder won't help with that and for now just seems more complex. The easiest seems to be to just define MDBuilder::PCSection which users can use as entry type for any type of vector or array.

vitalybuka accepted this revision.Aug 5 2022, 10:58 AM

LGTM. Thanks!

This revision is now accepted and ready to land.Aug 5 2022, 10:58 AM
melver updated this revision to Diff 453012.Aug 16 2022, 7:54 AM

Update documentation with "Guarantees on Code Generation".

melver added a reviewer: rnk.Aug 16 2022, 7:55 AM
melver updated this revision to Diff 453013.Aug 16 2022, 7:58 AM

Fix typo.

melver added a comment.Sep 5 2022, 8:58 AM

Hi all - if there are no more major comments, we intend to land the entire patch series mid-week or early next week.
Thanks!

Hi all - if there are no more major comments, we intend to land the entire patch series mid-week or early next week.
Thanks!

I highly recommend do not lend them at once, and have delay in a few hours between patches.
Give a chance to bots maintainers to revert just a single patch when something goes wrong.

Hi all - if there are no more major comments, we intend to land the entire patch series mid-week or early next week.
Thanks!

I highly recommend do not lend them at once, and have delay in a few hours between patches.
Give a chance to bots maintainers to revert just a single patch when something goes wrong.

Thanks, will do.

This revision was automatically updated to reflect the committed changes.