This is an archive of the discontinued LLVM Phabricator instance.

ThinLTO: Fix inline assembly references to static functions with CFI
ClosedPublic

Authored by samitolvanen on Jun 10 2021, 1:44 PM.

Diff Detail

Event Timeline

samitolvanen created this revision.Jun 10 2021, 1:44 PM
samitolvanen requested review of this revision.Jun 10 2021, 1:44 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 10 2021, 1:44 PM
samitolvanen added inline comments.
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
93

Note that I'm adding the alias to llvm.compiler.used because it's otherwise removed during optimization. Is there are better way to accomplish this?

pcc accepted this revision.Jun 17 2021, 9:16 AM

LGTM

clang/test/CodeGen/thinlto-cfi-icall-static-inline-asm.c
3 ↗(On Diff #351253)

Can the test be moved to llvm/test/Transforms/ThinLTOBitcodeWriter and made to use opt -thinlto-bc?

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
93

Not as far as I know, that's what I'd recommend.

This revision is now accepted and ready to land.Jun 17 2021, 9:16 AM

Moved the test to llvm/test/Transforms/ThinLTOBitcodeWriter.

thakis added a subscriber: thakis.Jun 22 2021, 12:06 PM

Looks like this breaks check-llvm on mac, see http://45.33.8.238/mac/32814/summary.html

Please take a look and revert for now if it takes a while to fix.

samitolvanen reopened this revision.Jun 22 2021, 2:34 PM
This revision is now accepted and ready to land.Jun 22 2021, 2:34 PM

Fix a use-of-uninitialized-value error.

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
71–93

Can you avoid making a copy of the OldName by doing the appendToCompilerUsed BEFORE making the dangling reference via ExportGV.setName(NewName);?

samitolvanen added inline comments.Jun 23 2021, 8:24 AM
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
71–93

No, I have to rename the existing function before I can create an alias with the same name, and as ExportGV.setName() invalidates Name, I need to create a copy first.

nickdesaulniers accepted this revision.Jun 23 2021, 10:03 AM
This revision was landed with ongoing or failed builds.Jun 23 2021, 11:10 AM
This revision was automatically updated to reflect the committed changes.
zequanwu added a subscriber: zequanwu.EditedJun 23 2021, 7:17 PM

Hi, this caused compiler crash: "Assertion `materialized_use_empty() && "Uses remain when a value is destroyed!"'" on chromium build https://ci.chromium.org/ui/p/chromium/builders/try/linux-official/151/overview.
Here is the error message:

While deleting: void ()* %
Use still stuck around after Def is destroyed:i8* bitcast (void ()* <badref> to i8*)
While deleting: void ()* %
Use still stuck around after Def is destroyed:i8* bitcast (voidld.lld: /usr/local/google/home/zequanwu/llvm-project/llvm/lib/IR/Value.cpp:103: llvm::Value::~Value(): Assertion `materialized_use_empty() && "Uses remain when a value is destroyed!"' failed.
 ()* <badref> to i8*)

Thanks for the revert, I'll take a look.

samitolvanen reopened this revision.Jul 13 2021, 11:19 AM
This revision is now accepted and ready to land.Jul 13 2021, 11:19 AM

Moved the alias creation to module level inline assembly to avoid issues with LowerTypeTestsModule, based on pcc's suggestion.

Change LGTM, but I don't understand why the following tests are modified:

  • llvm/test/ThinLTO/X86/devirt2.ll
  • llvm/test/Transforms/ThinLTOBitcodeWriter/split-internal2.ll
  • llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll

Change LGTM, but I don't understand why the following tests are modified:

  • llvm/test/ThinLTO/X86/devirt2.ll

This is needed to fix two missing symbol resolution errors that are caused by the aliases we added.

  • llvm/test/Transforms/ThinLTOBitcodeWriter/split-internal2.ll
  • llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll

And for these, we need to specify a target triple to use module inline assembly. According to pcc, there shouldn't be a real-world situation where the triple is missing, but these two tests don't currently specify one.

Change LGTM, but I don't understand why the following tests are modified:

  • llvm/test/ThinLTO/X86/devirt2.ll

This is needed to fix two missing symbol resolution errors that are caused by the aliases we added.

I'm curious if this will lead to breakages with LTO in general? I suppose not, since it's llvm-lto2 that needs the explicit list of symbols that can be linked against.

  • llvm/test/Transforms/ThinLTOBitcodeWriter/split-internal2.ll
  • llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll

And for these, we need to specify a target triple to use module inline assembly. According to pcc, there shouldn't be a real-world situation where the triple is missing, but these two tests don't currently specify one.

Neither of these tests use module inline assembly though, AFAICT?

samitolvanen added a comment.EditedJul 16 2021, 10:59 AM

Change LGTM, but I don't understand why the following tests are modified:

  • llvm/test/ThinLTO/X86/devirt2.ll

This is needed to fix two missing symbol resolution errors that are caused by the aliases we added.

I'm curious if this will lead to breakages with LTO in general? I suppose not, since it's llvm-lto2 that needs the explicit list of symbols that can be linked against.

No, this won't break LTO.

  • llvm/test/Transforms/ThinLTOBitcodeWriter/split-internal2.ll
  • llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll

And for these, we need to specify a target triple to use module inline assembly. According to pcc, there shouldn't be a real-world situation where the triple is missing, but these two tests don't currently specify one.

Neither of these tests use module inline assembly though, AFAICT?

The tests don't, but we add module inline assembly to create the aliases, which means that IR with type metadata will need a target triple for ThinLTO.

nickdesaulniers accepted this revision.Jul 16 2021, 11:08 AM
pcc accepted this revision.Jul 16 2021, 11:15 AM

LGTM

This revision was landed with ongoing or failed builds.Jul 16 2021, 2:34 PM
This revision was automatically updated to reflect the committed changes.
samitolvanen reopened this revision.Jul 16 2021, 2:55 PM

The tests that now specify a target triple also need ; REQUIRES: x86-registered-target or they will obviously fail. I'll upload another revision.

This revision is now accepted and ready to land.Jul 16 2021, 2:55 PM

Added REQUIRES: x86-registered-target to tests.

nickdesaulniers accepted this revision.Jul 16 2021, 3:33 PM

Sorry, I always forget when these are necessary.

This revision was landed with ongoing or failed builds.Jul 20 2021, 10:30 AM
This revision was automatically updated to reflect the committed changes.

This patch broke the sanitizer-windows bot: https://lab.llvm.org/buildbot/#/builders/127/builds/14257

Failed Tests (4):
  cfi-devirt-lld-thinlto-x86_64 :: anon-namespace.cpp
  cfi-devirt-lld-thinlto-x86_64 :: simple-pass.cpp
  cfi-standalone-lld-thinlto-x86_64 :: anon-namespace.cpp
  cfi-standalone-lld-thinlto-x86_64 :: simple-pass.cpp

Please revert or fix.

This patch broke the sanitizer-windows bot: https://lab.llvm.org/buildbot/#/builders/127/builds/14257

Failed Tests (4):
  cfi-devirt-lld-thinlto-x86_64 :: anon-namespace.cpp
  cfi-devirt-lld-thinlto-x86_64 :: simple-pass.cpp
  cfi-standalone-lld-thinlto-x86_64 :: anon-namespace.cpp
  cfi-standalone-lld-thinlto-x86_64 :: simple-pass.cpp

Please revert or fix.

This should fix the -msvc targets: https://reviews.llvm.org/D106392

samitolvanen reopened this revision.Jul 20 2021, 2:07 PM
This revision is now accepted and ready to land.Jul 20 2021, 2:07 PM
nickdesaulniers added inline comments.
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
90

Is there more information about "promotion aliases with x86_64-pc-windows-msvc" from D106392? Can we quote these only for msvc target triples? Can we add a comment about the quoting be necessary for those targets?

It's still not clear to me how tests using explicit -linux-gnu triples could fail on -mscv hosts.

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
90

Also, I think the quoting hurts the readability of the generated asm. Maybe that doesn't matter for LTO, but I'd be curious if we could do such escaping only when necessary? Perhaps that's only when targeting -msvc triples?

rnk added inline comments.Jul 20 2021, 2:37 PM
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
90

I was going to say, the proper way to do this is what MCSymbol::print does:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/MC/MCSymbol.cpp#L59

That code doesn't seem exhaustive, but at least it escapes quotes. We can't call MC from here due to library layering.

The LLVM IR readability is poor, but inline asm in IR is already hard to read. I wouldn't worry about that, I'd mainly worry about the output of -S.

Quoting always seems fine to me, I guess.

samitolvanen added inline comments.Jul 20 2021, 3:41 PM
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
90

Is there more information about "promotion aliases with x86_64-pc-windows-msvc" from D106392?

Yes, the -msvc targets use Visual C++ compatible name mangling, which requires quotes when referred to in inline assembly. Here's a trivial reproducer:

$ cat test.cpp 
static void a(void) {}
void* b(void) { return (void *)a; }
$ clang -flto=thin -fvisibility=default -fsanitize=cfi -c test.cpp
$ clang -flto=thin -fvisibility=default -fsanitize=cfi -target x86_64-pc-windows-msvc -c test.cpp
Either SourceMgr should be available
UNREACHABLE executed at llvm-project/llvm/lib/MC/MCContext.cpp:913!
...

Here's the inline assembly generated when we compile the above example for the -msvc target:

.set ?a@@YAXXZ,?a@@YAXXZ.d7b56b39ccc53bc7515ae1b2533f1e3d

Can we quote these only for msvc target triples? Can we add a comment about the quoting be necessary for those targets?

I assume we could limit this only to -msvc targets, but I feel like that would unnecessarily complicate the code as there's no harm in quoting the names always.

It's still not clear to me how tests using explicit -linux-gnu triples could fail on -mscv hosts.

These tests don't specify a -linux-gnu triple. They are C++ and end up using the default target, which in this case is -msvc.

samitolvanen added inline comments.Jul 20 2021, 4:32 PM
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
90

I was going to say, the proper way to do this is what MCSymbol::print does:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/MC/MCSymbol.cpp#L59

That code doesn't seem exhaustive, but at least it escapes quotes. We can't call MC from here due to library layering.

Is it actually possible to have function names that contain quotes? If so, I suppose we need to do something similar here and escape any quotes in the names.

The LLVM IR readability is poor, but inline asm in IR is already hard to read. I wouldn't worry about that, I'd mainly worry about the output of -S.

The promotion happens only when we write bitcode, so the aliases won't yet exist in the -S output.

samitolvanen planned changes to this revision.Jul 20 2021, 4:59 PM

As we only care about fixing inline assembly references, mangled names are
not that important in the first place. This version skips any functions
that have unusual characters in their names that would otherwise require
quotes, which includes any functions with MSVC compatible name mangling.

This revision is now accepted and ready to land.Jul 21 2021, 3:17 PM
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
39

What about MCAsmInfoXCOFF::isAcceptableChar? I guess we could be targeting a XCOFF object file format with LTO?

45

Can llvm::any_of or llvm::none_of be used here?
llvm/ADT/STLExtras.h

samitolvanen added inline comments.Jul 30 2021, 11:21 AM
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
39

Sure, I'll drop '$' and '@' from the list to play nice with XCOFF.

45

Maybe, but I don't see how they would make this function any cleaner. Did you have something specific in mind?

Also skip functions with names incompatible with XCOFF.

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
45

Something like?

return any_of(Name, [](const char &C) { return isAlnum(C) || C == '_' || C == '.'; }

or maybe we need !none_of(...)? (not sure if characters of a string can be enumerated this way)

samitolvanen added inline comments.Jul 30 2021, 12:20 PM
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
45

Since we want to ensure all the characters in the string match the predicate, I believe it would make sense to use all_of() instead.

However, I don't see the point of introducing additional complexity to such a trivial function to shave off a couple of lines of code. While it might not actually matter here, using all_of() also seems to generate ~5x as many instructions to execute:

https://godbolt.org/z/Pndfxj6rM

If you feel like the current version is too long, I can drop one line by changing the loop to use:

if (!isAlnum(C) && C != '_' && C != '.')
  return false;

I initially wanted to keep the test identical to MCAsmInfo::isAcceptableChar() to make it easier to see it actually matches, but since it's no longer identical, I suppose that doesn't matter.

Thoughts?

rnk added a comment.Jul 30 2021, 12:47 PM

I think the new approach of skipping non C-ish identifier names is reasonable. Looks good to me, but wait for the more active reviewers to stamp it.

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
45

using all_of() also seems to generate ~5x as many instructions to execute:

Hard to argue with that.

Might be nice to add some test coverage of this code with an mscv target triple though.

nickdesaulniers accepted this revision.Jul 30 2021, 12:49 PM

Mentioning it here in case others run into the same thing: We bisected a 7x (!) binary size regression to this. Details at https://bugs.chromium.org/p/chromium/issues/detail?id=1261715