This is an archive of the discontinued LLVM Phabricator instance.

[cfi] Implement cfi-icall using inline assembly.
ClosedPublic

Authored by eugenis on Oct 24 2016, 3:30 PM.

Details

Reviewers
pcc
echristo
Summary

The current implementation is emitting a global constant that happens
to evaluate to the same bytes + relocation as a jump instruction on
X86. This does not work for PIE executables and shared libraries
though, because we end up with a wrong relocation type. And it has no
chance of working on ARM/AArch64 which use different relocation types
for jump instructions (R_ARM_JUMP24) that is never generated for
data.

This change replaces the constant with module-level inline assembly
followed by a hidden declaration of the jump table. Works fine for
ARM/AArch64, but has some drawbacks.

  • Extra symbols are added to the static symbol table, which inflate

the size of the unstripped binary a little. Stripped binaries are not
affected. This happens because jump table declarations must be
external (because their body is in the inline asm).

  • Original functions that were anonymous are now named

<original name>.cfi, and it affects symbolization sometimes. This is
necessary because the only user of these functions is the (inline
asm) jump table, so they had to be added to @llvm.used, which does
not allow unnamed functions.

Diff Detail

Repository
rL LLVM

Event Timeline

eugenis updated this revision to Diff 75654.Oct 24 2016, 3:30 PM
eugenis retitled this revision from to [cfi] Implement cfi-icall using inline assembly..
eugenis updated this object.
eugenis added reviewers: pcc, echristo.
eugenis set the repository for this revision to rL LLVM.
eugenis added a subscriber: llvm-commits.
pcc added inline comments.Oct 24 2016, 5:12 PM
lib/Transforms/IPO/LowerTypeTests.cpp
669

Could be simpler to have a single AsmOS that everyone writes to, then a single appendModuleInlineAsm at the end?

687

Don't you want to copy the linkage from the original function here?

736

http://llvm-cs.pcc.me.uk/?q=setSection.*llvm.metadata

We have code like this all over the place. May I ask you to add a general utility function for this?

829

This could just be Int8Ty since the type isn't important any more.

eugenis added inline comments.Oct 25 2016, 2:19 PM
lib/Transforms/IPO/LowerTypeTests.cpp
736
eugenis updated this revision to Diff 75810.Oct 25 2016, 4:23 PM
eugenis marked 2 inline comments as done.
eugenis added inline comments.
lib/Transforms/IPO/LowerTypeTests.cpp
687

done, I think

829

The type matters for the GEP at line 830

pcc added inline comments.Oct 25 2016, 5:56 PM
lib/Transforms/IPO/LowerTypeTests.cpp
684

I wonder whether we want to escape these names? This appears to be the escaping we'd need to use:
http://llvm-cs.pcc.me.uk/lib/MC/MCSymbol.cpp#53

Maybe it's not necessary though since we pretty much expect all names we see to satisfy MCAsmInfo::isValidUnquotedName.

687

Also weak?

829

Sorry, yes, thanks.

pcc added inline comments.Oct 25 2016, 7:25 PM
lib/Transforms/IPO/LowerTypeTests.cpp
218

You no longer need this variable.

eugenis marked an inline comment as done.Oct 26 2016, 2:23 PM
eugenis added inline comments.
lib/Transforms/IPO/LowerTypeTests.cpp
684

Looks like we should escape the names.
It's easy to call a function "\n" with asm in the declaration, and that breaks the jumptable code.

eugenis updated this revision to Diff 75966.Oct 26 2016, 4:39 PM

PTAL

lib/Transforms/IPO/LowerTypeTests.cpp
684

It appears that neither llvm-mc nor gnu as support quoted identifier properly.
Gnu as allows a quoted string in, for example, .globl directive. Llvm-mc does too, but it forgets to unescape it, so, say, "\n" turns into a 2-character string when saves as textual asm and read back.

Neither support quoted strings in expressions we need to declare aliases, like

"\n" = .jumptable + 4.

I think I'll just make it a fatal error to apply cfi-icall to any name that requires escaping.

pcc added inline comments.Oct 26 2016, 5:12 PM
lib/Transforms/IPO/LowerTypeTests.cpp
684

Makes sense, I suppose. Though I think you'll also need to apply IR mangling (Mangler::getNameWithPrefix).

736

Maybe isValidAsmUnquotedName or something?

738

This function should have a FIXME to replace the inline asm with some better representation.

test/Transforms/LowerTypeTests/function.ll
1โ€“5

Maybe use check-prefixes here to reduce duplication below?

eugenis updated this revision to Diff 75973.Oct 26 2016, 6:22 PM
eugenis marked 3 inline comments as done.
eugenis added inline comments.
test/Transforms/LowerTypeTests/function.ll
1โ€“5

I'd rather not. It will make an unreadable mess of x86, arm and common lines

pcc added inline comments.Oct 26 2016, 6:40 PM
lib/Transforms/IPO/LowerTypeTests.cpp
671

Although I think it doesn't matter in this case, it would probably be better to use the overload that takes a GlobalValue.

eugenis added inline comments.Oct 27 2016, 1:01 PM
lib/Transforms/IPO/LowerTypeTests.cpp
671

That one is not static. It can assign temporary labels for unnamed globals, but to use that we would need to access the instance in TargetLoweringObjectFile, not create our own.

echristo added inline comments.Oct 27 2016, 1:05 PM
lib/Transforms/IPO/LowerTypeTests.cpp
671

This is why I was mentioning wanting TLOF for this in general. Effectively you can get two different sets of labels if you try the "calling Mangler myself" path.

I'd rather initialize/split TLOF early on the TM rather than have lots of different mechanisms to get symbol names.

pcc added inline comments.Oct 27 2016, 1:07 PM
lib/Transforms/IPO/LowerTypeTests.cpp
671

But you're renaming unnamed globals elsewhere? With that just a temporary instance should be enough.

eugenis added inline comments.Oct 27 2016, 1:16 PM
lib/Transforms/IPO/LowerTypeTests.cpp
671

Actually, the current code does not handle unnamed functions. I'll fix it.

Eric, could you elaborate on the "initialize/split TLOF early on the TM"?

eugenis updated this revision to Diff 76090.Oct 27 2016, 1:39 PM
eugenis updated this revision to Diff 76092.Oct 27 2016, 1:45 PM
eugenis added inline comments.
lib/Transforms/IPO/LowerTypeTests.cpp
671

Renaming unnamed globals now.
Switched to non-static Mangler methods - they seem to do something special on Windows; we probably want that.

pcc accepted this revision.Oct 27 2016, 2:02 PM
pcc edited edge metadata.

LGTM if Eric is ok with this.

Eric: in the long term do we want the mangler exposed to Transforms? Ideally Transforms should just be dealing with GlobalValues, correct? If that's the case, it doesn't seem worth threading TLOF through to here.

This revision is now accepted and ready to land.Oct 27 2016, 2:02 PM
eugenis updated this revision to Diff 76249.Oct 28 2016, 2:50 PM
eugenis edited edge metadata.

Added hidden visibility to CFI aliases to internal functions.
I'm not sure why aarch64 gold does not relax GOT relocations against local symbols into a simple PC-relative offset, but that's how it is.

Eric, are you OK with this?

eugenis updated this revision to Diff 76485.Oct 31 2016, 2:54 PM

Switched from .p2align to .balign and reduced arm/aarch64 jump table alignment to 4.

echristo edited edge metadata.Oct 31 2016, 3:23 PM
In D25927#581379, @pcc wrote:

LGTM if Eric is ok with this.

Eric: in the long term do we want the mangler exposed to Transforms? Ideally Transforms should just be dealing with GlobalValues, correct? If that's the case, it doesn't seem worth threading TLOF through to here.

Long term it's interesting - I think if you're creating named symbols in a Transform we're going to need to plug this through, and TLOF is a better option for doing that than everyone calling various (different) versions of the Mangler/TLOF/TM current API that I'm cleaning up.

-eric

pcc added a comment.Oct 31 2016, 3:31 PM
In D25927#581379, @pcc wrote:

LGTM if Eric is ok with this.

Eric: in the long term do we want the mangler exposed to Transforms? Ideally Transforms should just be dealing with GlobalValues, correct? If that's the case, it doesn't seem worth threading TLOF through to here.

Long term it's interesting - I think if you're creating named symbols in a Transform we're going to need to plug this through, and TLOF is a better option for doing that than everyone calling various (different) versions of the Mangler/TLOF/TM current API that I'm cleaning up.

My point is that eventually we will come to some consensus on an IR-level representation for jump tables, so we would not need to mangle symbols here.

rjmccall added inline comments.
lib/Transforms/IPO/LowerTypeTests.cpp
880โ€“889

This variable is dead now.

eugenis updated this revision to Diff 76634.Nov 1 2016, 2:30 PM
eugenis edited edge metadata.
echristo accepted this revision.Nov 10 2016, 6:03 PM
echristo edited edge metadata.

Since my Mangler/TLOF work isn't finished yet, for now go ahead with this. A couple of requests: a) add a comment/assert that global symbols aren't allowed (we don't want to collide with others) and b) that we should be using the one off of TLOF if possible.

Thanks!

-eric

eugenis updated this revision to Diff 77583.Nov 10 2016, 6:10 PM
eugenis edited edge metadata.

added a FIXME and an assert for the mangler vs TLOF issue.

eugenis closed this revision.Nov 28 2016, 1:31 PM

r286611, thanks

lib/Transforms/IPO/LowerTypeTests.cpp
880โ€“889

and it's gone