This is an archive of the discontinued LLVM Phabricator instance.

[MC] Add 'G' to augmentation string for MTE instrumented functions
ClosedPublic

Authored by fmayer on Jun 3 2022, 2:52 PM.

Details

Summary

This was agreed on in
https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html

The thread proposed two options

  • add a character to augmentation string and handle in libuwind
  • use a separate personality function.

It was determined that this is the simpler and better option.

This is part of ARM's Aarch64 ABI:
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id22

The next step after this is teaching libunwind to untag when this
augmentation character is set.

Diff Detail

Event Timeline

fmayer created this revision.Jun 3 2022, 2:52 PM
Herald added a project: Restricted Project. · View Herald Transcript
fmayer updated this revision to Diff 434156.Jun 3 2022, 3:10 PM

respect nounwind

fmayer updated this revision to Diff 434157.Jun 3 2022, 3:11 PM

comment

fmayer updated this revision to Diff 434159.Jun 3 2022, 3:19 PM

comment

fmayer edited the summary of this revision. (Show Details)Jun 3 2022, 3:21 PM
fmayer published this revision for review.Jun 3 2022, 3:23 PM
fmayer added a reviewer: eugenis.
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2022, 3:24 PM
MaskRay added a comment.EditedJun 3 2022, 4:00 PM

The lld/ELF change should be in a separate patch.
It's the convention and I am usually picky on how tests should be made.

This was agreed on in https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html.

This agreement is insufficient for the broad Linux community.

Linux Standard Base Core Specification, Generic Part defines augmentation string characters. Essentially this needs some GCC agreement, otherwise there could be potential future conflict.

fmayer added a comment.Jun 3 2022, 4:04 PM

This was agreed on in https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html.

This agreement is insufficient for the broad Linux community.

Linux Standard Base Core Specification, Generic Part defines augmentation string characters. Essentially this needs some GCC agreement, otherwise there could be potential future conflict.

Thanks! I was wondering about this. I will get broader agreement prior to moving forward with this.

fmayer updated this revision to Diff 434211.Jun 3 2022, 5:10 PM

split off lld / dwarf parser change.

Thanks for starting the GCC thread! I think the implementation is reasonable.

[llvm] Add 'G' to augmentation string for MTE instrumented functions.

[MC] may be more specific. The convention is to avoid trailing dot in the subject line.

This was agreed on in https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html.

Avoid trailing . in an URI. Many tools consider . part of the URI.

Try summarizing the main point of the thread and place it in the commit message (summary).

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1294

There are too many blank lines.

Use

if (getFunctionCFISectionType(*MF) != CFISection::None)
  OutStreamer->emitCFIMTETaggedFrame();
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
6532
if (parseEOL())
  return true;

I've updated other unexpected *** in ... directive to use the prevailing style expected newline.

llvm/test/CodeGen/AArch64/stack-tagging-cfi.ll
3 ↗(On Diff #434211)

i32* => ptr

7 ↗(On Diff #434211)
fmayer added a comment.Jun 6 2022, 9:44 AM

Thanks for starting the GCC thread! I think the implementation is reasonable.

Turns out this is already part of ARM's ABI:
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id22

fmayer updated this revision to Diff 434519.Jun 6 2022, 10:12 AM
fmayer marked 4 inline comments as done.

address comments

MaskRay accepted this revision.Jun 6 2022, 10:27 AM

Looks great! Hope that @eugenis can check the test interop with msan.

This revision is now accepted and ready to land.Jun 6 2022, 10:27 AM
fmayer retitled this revision from [llvm] Add 'G' to augmentation string for MTE instrumented functions. to [MC] Add 'G' to augmentation string for MTE instrumented functions.Jun 6 2022, 10:35 AM
fmayer edited the summary of this revision. (Show Details)

Should we have a test for the augmentation string, through llvm-readelf?

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1289

As we've seen recently, sometimes even nounwind frames get unwound. Is it possible to relax some of these conditions to emit the "G" in as many cases as possible? This is the situation where MTE is more demanding to the unwind correctness, so I'd rather err on the safe side.

fmayer added a comment.Jun 6 2022, 5:20 PM

Should we have a test for the augmentation string, through llvm-readelf?

We have that test the follow-up CL that does the change to lld (as requested to be split by MaskRay).

fmayer added inline comments.Jun 6 2022, 5:22 PM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1289

I'm not sure. This is exactly the same logic as for the B-Key, so either we change both or neither. B-Key is just as important to correctness as this is.

What do you mean "nounwind frames get unwound"? AFAICT they do get skipped over, so the 'G' wouldn't help.

Should we have a test for the augmentation string, through llvm-readelf?

We have that test the follow-up CL that does the change to lld (as requested to be split by MaskRay).

Hmm ok. This tests does not really need to involve lld at all, but that's fine.

fmayer added a comment.EditedJun 6 2022, 5:28 PM

Should we have a test for the augmentation string, through llvm-readelf?

We have that test the follow-up CL that does the change to lld (as requested to be split by MaskRay).

Hmm ok. This tests does not really need to involve lld at all, but that's fine.

It does, because we need LLD so we can get an ELF file (which we then can readelf or, as the test does objdump).

You can llvm-objdump --dwarf=frames a .o file.

fmayer added a comment.Jun 6 2022, 5:42 PM

Should we have a test for the augmentation string, through llvm-readelf?

We have that test the follow-up CL that does the change to lld (as requested to be split by MaskRay).

Hmm ok. This tests does not really need to involve lld at all, but that's fine.

It does, because we need LLD so we can get an ELF file (which we then can readelf or, as the test does objdump).

Good point. The change to the DWARF parser for that is also in the follow up change though, but I changed the test to both check the .o and the .so files.

fmayer added inline comments.Jun 6 2022, 5:49 PM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1289

I think the only thing we can do is force unwind tables for all frames that have stack instrumentation, but that would be quite a big hammer.

I'm not sure that is the right thing to do, where the real solution would be for people to fix their builds to not mix exception and noexception code in this way. Given that this mode is (at least for now) intended for testing, to me that seems like the better thing for people to do, as all sorts of stuff can go wrong when throwing exceptions through no-exception code (PAC as mentioned above, anything that needs RAII, &c).

It may be better to move the DebugInfo change from the lld patch into this patch.

fmayer added a comment.Jun 6 2022, 5:58 PM

It may be better to move the DebugInfo change from the lld patch into this patch.

I split it into D127171. Can also merge into here, no strong opinions

fmayer updated this revision to Diff 434922.Jun 7 2022, 1:11 PM

add another test

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1289

OK I think we have resolved this confusion. Compiling with -fno-exceptions does still generate IR with uwtables but also with`nounwind`. In this case the uwtable does still get generated, but not the cleanup actions. Added a test to that effect.

eugenis accepted this revision.Jun 7 2022, 1:22 PM

LGTM

This revision was landed with ongoing or failed builds.Jun 8 2022, 12:36 PM
This revision was automatically updated to reflect the committed changes.