This is an archive of the discontinued LLVM Phabricator instance.

LTO: add a code-model flag
ClosedPublic

Authored by martell on Feb 2 2017, 5:03 AM.

Details

Summary

This creates the same flag for gold and LLD for the ELF linker for consistency.
Values supported include default, small kernel, medium large.
If the flag is not passed it results in the existing "default" value.

Diff Detail

Repository
rL LLVM

Event Timeline

martell created this revision.Feb 2 2017, 5:03 AM

Looks like the diff is missing context. Also, is there any chance of any tests?

tools/gold/gold-plugin.cpp
672

Is there anywhere we could put this function so that we could share it with ELF/Driver.cpp to prevent essentially duplicate code?

679

Should this function emit an error if CodeModel is not recognised? As far as I can see, this will silently accept incorrect switches, which could be harmful (e.g. a typo leads to passing in "meidum" or similar).

martell updated this revision to Diff 86798.Feb 2 2017, 6:27 AM

Apologies @jhenderson I did have a test case but it never got included by git diff
Here is an updated version which also includes the context of every file.
I am open to suggestions on how best not to have code duplication.

Updated to raise relevant questions and try to address comments

tools/gold/gold-plugin.cpp
672

Based on the changes I tracked to update this patch from 3.9.0 to master a lot of the shared code between gold and lld has move into lib/LTO/LTOBackend.cpp
I'm not sure if this is the right place for argument parsing though.
I am open to suggestions?

679

The only way I could see emitting an error code initially here was adding the following
.Case("default", llvm::CodeModel::Default)
.Default(llvm::CodeModel::JITDefault);
Then error in the case where it is JITDefault.
This seems very hacky and not the correct way to do this.

Adding an llvm_unreachable seems the correct way to go but error is used in lld.
The error message should generally be the same across gold and lld ELF.
I'll wait on suggestions to where we can not have duplication and fix this together in one go.

ruiu edited reviewers, added: ruiu; removed: rui314.Feb 2 2017, 10:14 AM
ruiu added a reviewer: pcc.
pcc edited edge metadata.Feb 2 2017, 10:26 AM

I believe that in both linkers we can just write Config->CodeModel = CMModel; which would use the existing flag in CommandFlags.h.

In D29445#664931, @pcc wrote:

I believe that in both linkers we can just write Config->CodeModel = CMModel; which would use the existing flag in CommandFlags.h.

I believe this could be useful for setting the default.

Example:

llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
ld.lld %t -o %t2

This would result in the code model being what CMModel was set for x86_64-unknown-linux

I would like to ask
What happens in the following scenario?

clang -target=x86_64-unknown-linux -mcmodel=large %s -o %t
ld.lld %t -o %t2

Will CMModel be the default for that target like above or is the large model information in the object there for the linker to infer?

To follow up with @jhenderson also about the unification on the switch statements.
The same functionality also exists in clang in lib/CodeGen/BackendUtil.cpp in EmitAssemblyHelper::CreateTargetMachine
Again suggestions on how best to unify would be greatly welcomed.

martell updated this revision to Diff 86962.Feb 3 2017, 7:29 AM

Update to error on an invalid code model for the gold plugin

pcc added a comment.Feb 3 2017, 8:57 AM
In D29445#664931, @pcc wrote:

I believe that in both linkers we can just write Config->CodeModel = CMModel; which would use the existing flag in CommandFlags.h.

I believe this could be useful for setting the default.

Example:

llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
ld.lld %t -o %t2

This would result in the code model being what CMModel was set for x86_64-unknown-linux

I would like to ask
What happens in the following scenario?

clang -target=x86_64-unknown-linux -mcmodel=large %s -o %t
ld.lld %t -o %t2

Will CMModel be the default for that target like above or is the large model information in the object there for the linker to infer?

I don't think the code model information will be passed correctly from the compiler to the linker in that case. CMModel will be the value of the existing -code-model flag, which will be CodeModel::Default by default (same as the default in lto::Config, see http://llvm-cs.pcc.me.uk/include/llvm/CodeGen/CommandFlags.h#81). You can pass it to lld with (e.g.) -mllvm -code-model=large or to gold with -plugin-opt -code-model=large.

If we wanted to pass the code model from the compiler to the linker, I think we would need to introduce a function attribute that controls the code model. The compiler would apply the attribute to every function in the translation unit and it would make it to the linker via bitcode.

To follow up with @jhenderson also about the unification on the switch statements.
The same functionality also exists in clang in lib/CodeGen/BackendUtil.cpp in EmitAssemblyHelper::CreateTargetMachine
Again suggestions on how best to unify would be greatly welcomed.

Sorry for the delayed resposne. I don't know LLVM's layout well yet, but to me, the obvious place to put a function to do this would seem to be somewhere in the same library as where the enum is defined, i.e. the Support library of LLVM, (or possibly in the CodeGen library, since the enum is in CodeGen.h). Looking at the library I don't see an obvious candidate file though. Maybe it would be best to create a new one?

With now four instances of identical functionality, something is bound to go wrong sooner or later, if we don't de-duplicate it.

pcc added a comment.Feb 7 2017, 10:53 AM

To follow up with @jhenderson also about the unification on the switch statements.
The same functionality also exists in clang in lib/CodeGen/BackendUtil.cpp in EmitAssemblyHelper::CreateTargetMachine
Again suggestions on how best to unify would be greatly welcomed.

Sorry for the delayed resposne. I don't know LLVM's layout well yet, but to me, the obvious place to put a function to do this would seem to be somewhere in the same library as where the enum is defined, i.e. the Support library of LLVM, (or possibly in the CodeGen library, since the enum is in CodeGen.h). Looking at the library I don't see an obvious candidate file though. Maybe it would be best to create a new one?

With now four instances of identical functionality, something is bound to go wrong sooner or later, if we don't de-duplicate it.

As I have repeatedly mentioned, we can re-use the flag in CommandFlags.h.

martell updated this revision to Diff 87570.Feb 7 2017, 5:34 PM

Updated to add a test case for gold

I don't think the code model information will be passed correctly from the compiler to the linker in that case. CMModel will be the value of the existing -code-model flag, which will be CodeModel::Default by default (same as the default in lto::Config, see http://llvm-cs.pcc.me.uk/include/llvm/CodeGen/CommandFlags.h#81). You can pass it to lld with (e.g.) -mllvm -code-model=large or to gold with -plugin-opt -code-model=large.

I think I am missing something here. How does Config->CodeModel = CMModel; get all the way into the gold plugin for parsing. I had to add an option to process_plugin_option to pull out the string.
Is there something within cl::opt that has the magic to check the string against CMModel?
I don't currently see an example of this being used in that manner anywhere.
Hints are very welcome :)

pcc added a comment.Feb 7 2017, 5:59 PM

I don't think the code model information will be passed correctly from the compiler to the linker in that case. CMModel will be the value of the existing -code-model flag, which will be CodeModel::Default by default (same as the default in lto::Config, see http://llvm-cs.pcc.me.uk/include/llvm/CodeGen/CommandFlags.h#81). You can pass it to lld with (e.g.) -mllvm -code-model=large or to gold with -plugin-opt -code-model=large.

I think I am missing something here. How does Config->CodeModel = CMModel; get all the way into the gold plugin for parsing. I had to add an option to process_plugin_option to pull out the string.

In the gold plugin any options that the plugin does not recognize are passed through to cl::ParseCommandLineOptions, see http://llvm-cs.pcc.me.uk/tools/gold/gold-plugin.cpp#234

Is there something within cl::opt that has the magic to check the string against CMModel?

Yes, any cl::opts of enum type are checked during ParseCommandLineOptions. The command line library will produce an error message if the option string does not match any of the enumerators.

In D29445#670264, @pcc wrote:

I think I am missing something here. How does Config->CodeModel = CMModel; get all the way into the gold plugin for parsing. I had to add an option to process_plugin_option to pull out the string.

In the gold plugin any options that the plugin does not recognize are passed through to cl::ParseCommandLineOptions, see http://llvm-cs.pcc.me.uk/tools/gold/gold-plugin.cpp#234

Is there something within cl::opt that has the magic to check the string against CMModel?

Yes, any cl::opts of enum type are checked during ParseCommandLineOptions. The command line library will produce an error message if the option string does not match any of the enumerators.

Great thanks for clearing that up @pcc
@rafael your message to the thread never ended up on phab, I didn't miss it though I added a gold testcase as you asked. The StringSwitch should not be necessary though if we use CommandFlags.

Using CommandFlags.h seems to create a conflict because it is included in LTO.cpp in lld also.
Ignoring the issue of Target being ambitious which is easily solved by specifying the lld::elf type, the link time error for duplicate symbols is not.

lib/liblldELF.a(LTO.cpp.o):(.bss._Z7ABINameB5cxx11+0x0): multiple definition of `ABIName[abi:cxx11]'
lib/liblldELF.a(Driver.cpp.o):(.bss._Z7ABINameB5cxx11+0x0): first defined here

What would be the suggested path here.
Moving createLTO into Driver.cpp does not seem like an ideal solution to me?

pcc added a comment.Feb 7 2017, 6:43 PM
In D29445#670264, @pcc wrote:

I think I am missing something here. How does Config->CodeModel = CMModel; get all the way into the gold plugin for parsing. I had to add an option to process_plugin_option to pull out the string.

In the gold plugin any options that the plugin does not recognize are passed through to cl::ParseCommandLineOptions, see http://llvm-cs.pcc.me.uk/tools/gold/gold-plugin.cpp#234

Is there something within cl::opt that has the magic to check the string against CMModel?

Yes, any cl::opts of enum type are checked during ParseCommandLineOptions. The command line library will produce an error message if the option string does not match any of the enumerators.

Great thanks for clearing that up @pcc
@rafael your message to the thread never ended up on phab, I didn't miss it though I added a gold testcase as you asked. The StringSwitch should not be necessary though if we use CommandFlags.

Using CommandFlags.h seems to create a conflict because it is included in LTO.cpp in lld also.
Ignoring the issue of Target being ambitious which is easily solved by specifying the lld::elf type, the link time error for duplicate symbols is not.

lib/liblldELF.a(LTO.cpp.o):(.bss._Z7ABINameB5cxx11+0x0): multiple definition of `ABIName[abi:cxx11]'
lib/liblldELF.a(Driver.cpp.o):(.bss._Z7ABINameB5cxx11+0x0): first defined here

What would be the suggested path here.
Moving createLTO into Driver.cpp does not seem like an ideal solution to me?

I would suggest adding a function to lld/lib/Core/TargetOptionsCommandFlags.cpp that returns CMModel, and then use that function in ELF/LTO.cpp to set Conf.CodeModel.

martell updated this revision to Diff 88124.Feb 12 2017, 7:30 AM

Updated to reflect guidance from @pcc

! @pcc wrote:
I would suggest adding a function to lld/lib/Core/TargetOptionsCommandFlags.cpp that returns CMModel, and then use that function in ELF/LTO.cpp to set Conf.CodeModel.

Ahh this was added only a few days ago. I wasn't working on latest HEAD so I didn't see that and ended up creating it before realising. :)
This should be good to merge now I think?

martell updated this revision to Diff 88141.Feb 12 2017, 10:15 AM

The diff was malformed

pcc accepted this revision.Feb 12 2017, 2:03 PM

The code changes look good, but would it be possible to change the test cases to verify that the output uses the correct code model?

tools/lld/ELF/Config.h
16

Remove this line.

tools/lld/lib/Core/TargetOptionsCommandFlags.cpp
30 ↗(On Diff #88141)

Suggest renaming to GetCommandLineCodeModel.

This revision is now accepted and ready to land.Feb 12 2017, 2:03 PM
martell updated this revision to Diff 88174.Feb 13 2017, 3:13 AM

Updated for post acceptance feedback

In D29445#674672, @pcc wrote:

The code changes look good, but would it be possible to change the test cases to verify that the output uses the correct code model?

I don't have a test case for that off hand, the user who was using this had a large table of data and code model large was not being respected by the linker.
I don't have access to that source file though. Looking through llvm and clang I can't initially find anything that does that either.
The clang test seems to be very similar to what I already have clang/test/Driver/code-model.c

pcc added a comment.Feb 13 2017, 11:10 AM

It looks like you could adapt the test case llvm/test/CodeGen/X86/codemodel.ll.

In D29445#675329, @pcc wrote:

It looks like you could adapt the test case llvm/test/CodeGen/X86/codemodel.ll.

I created a test case based on llvm/test/CodeGen/X86/codemodel.ll

; REQUIRES: x86
; RUN: llvm-as %s -o %t.o
; RUN: ld.lld -m elf_x86_64 %t.o -o %ts.so -mllvm -code-model=small -shared
; RUN: ld.lld -m elf_x86_64 %t.o -o %tk.so -mllvm -code-model=kernel -shared
; RUN: llvm-objdump -d %ts.so -o - | FileCheck %s --check-prefix=SMALL
; RUN: llvm-objdump -d %tk.so -o - | FileCheck %s --check-prefix=KERNEL

target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"
target triple = "x86_64-unknown-linux-gnu"
@data = external global [0 x i32] ; <[0 x i32]*> [#uses=5]

define i32 @foo() nounwind readonly {
entry:
; CHECK-SMALL-LABEL:  foo:
; CHECK-SMALL:   movl 4233(%rip), %eax
; CHECK-KERNEL-LABEL: foo:
; CHECK-KERNEL:  movl 4233, %eax
    %0 = load i32, i32* getelementptr ([0 x i32], [0 x i32]* @data, i64 0, i64 0), align 4 ; <i32> [#uses=1]
    ret i32 %0
}

This is the result I am getting from the llvm-objdump stage

codemodels.so:  file format ELF64-x86-64
Disassembly of section .text:
foo:
    1000:       48 8b 05 89 10 00 00    movq    4233(%rip), %rax
    1007:       8b 00   movl    (%rax), %eax
    1009:       c3      retq

codemodelk.so:  file format ELF64-x86-64
Disassembly of section .text:
foo:
    1000:       48 8b 05 89 10 00 00    movq    4233(%rip), %rax
    1007:       8b 00   movl    (%rax), %eax
    1009:       c3      retq

The assumption I am making is that llvm-as only converts llvm-ir in a text format to a binary format.
So as a result I should be able to use the code model passed after -mllvm to decide what is generated.
I also tried changing the target to i686 or i386 with no joy.

Just for a test I did pass -code-model=large and I get errors
ld.lld: error: ld-temp.o:(function foo): can't create dynamic relocation R_X86_64_64 against symbol 'data' defined in codemodel.o
this makes sense because of @data = external global [0 x i32] ; <[0 x i32]*> [#uses=5]

I got the following test case working as test/ELF/lto/codemodel.ll

; REQUIRES: x86
; RUN: llvm-as %s -o %t.o
; RUN: ld.lld -m elf_x86_64 %t.o -o %ts -e foo --lto-O0 -mllvm -code-model=small
; RUN: ld.lld -m elf_x86_64 %t.o -o %tl -e foo --lto-O0 -mllvm -code-model=large
; RUN: llvm-objdump -d %ts | FileCheck %s --check-prefix=CHECK-SMALL
; RUN: llvm-objdump -d %tl | FileCheck %s --check-prefix=CHECK-LARGE

target triple = "x86_64-unknown-linux-gnu"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

@data = internal constant [0 x i32] []

define i32 @foo() nounwind readonly {
entry:
; CHECK-SMALL-LABEL:  foo:
; CHECK-SMALL: movl    -3814(%rip), %eax
; CHECK-LARGE-LABEL: foo:
; CHECK-LARGE: movabsq $2097440, %rax
; CHECK-LARGE: movl    (%rax), %eax
    %0 = load i32, i32* getelementptr ([0 x i32], [0 x i32]* @data, i64 0, i64 0), align 4
    ret i32 %0
}

@pcc do you want me to keep test/ELF/code-model.s also ?
Also do you have any issues you have with the above testcase?

This revision was automatically updated to reflect the committed changes.