This is an archive of the discontinued LLVM Phabricator instance.

Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations
ClosedPublic

Authored by MaskRay on Dec 3 2020, 10:00 PM.

Details

Summary

GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made
-fpie code emit R_X86_64_PC32 to reference external data symbols by default.
Clang adopted -mpie-copy-relocations D19996 as a flexible alternative.

The name -mpie-copy-relocations can be improved [1] and does not capture the
idea that this option can apply to -fno-pic and -fpic [2], so this patch
introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations
their aliases for compatibility.

[1]
For

extern int var;
int get() { return var; }

if var is defined in another translation unit in the link unit, there is no copy
relocation.

[2]
-fno-pic -fno-direct-access-external-data is useful to avoid copy relocations.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888
If a shared object is linked with -Bsymbolic or --dynamic-list and exports a
data symbol, normally the data symbol cannot be accessed by -fno-pic code
(because by default an absolute relocation is produced which will lead to a copy
relocation). -fno-direct-access-external-data can prevent copy relocations.

-fpic -fdirect-access-external-data can avoid GOT indirection. However, the
user should be responsible for defining var in another translation unit and link
with -Bsymbolic or --dynamic-list, otherwise the linker will error in a -shared
link. Generally the user has better tools for their goal but I want to mention that
this combination is valid.

On COFF, the behavior is like always -fdirect-access-external-data.
__declspec(dllimport) is needed to enable indirect access.

There is currently no plan to affect non-ELF behavior or -fpic behavior.

-fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch.

GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112

Diff Detail

Unit TestsFailed

Event Timeline

MaskRay created this revision.Dec 3 2020, 10:00 PM
MaskRay requested review of this revision.Dec 3 2020, 10:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 3 2020, 10:00 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

You said : "The name -mpie-copy-relocations is misleading [1] and does not capture the idea that this option can actually apply to all of -fno-pic,-fpie, ..."

Could you please clarify why this option needs to apply to -fno-pic? Here is what I tried with trunk clang:

extern int var;
int get() { return var; }

$ clang -S foo.c -o -
....
movl var, %eax
popq %rbp
...

With -fno-pic, this will never need to use -mpie-copy-relocations at all, so this case is not relevant right? Did I miss anything?

You said : "The name -mpie-copy-relocations is misleading [1] and does not capture the idea that this option can actually apply to all of -fno-pic,-fpie, ..."

Could you please clarify why this option needs to apply to -fno-pic? Here is what I tried with trunk clang:

If the user wants to guarantee no copy relocations in -fno-pic code, they can theoretically apply -fno-direct-access-external-data to use a GOT indirection.
This is not implemented, though.

extern int var;
int get() { return var; }

$ clang -S foo.c -o -
....
movl var, %eax
popq %rbp
...

With -fno-pic, this will never need to use -mpie-copy-relocations at all, so this case is not relevant right? Did I miss anything?

-fno-pic code can only be used with -no-pie links (position-dependent executables) If var is not defined in the linked executable, it will have a copy relocation.

You said : "The name -mpie-copy-relocations is misleading [1] and does not capture the idea that this option can actually apply to all of -fno-pic,-fpie, ..."

Could you please clarify why this option needs to apply to -fno-pic? Here is what I tried with trunk clang:

If the user wants to guarantee no copy relocations in -fno-pic code, they can theoretically apply -fno-direct-access-external-data to use a GOT indirection.
This is not implemented, though.

extern int var;
int get() { return var; }

$ clang -S foo.c -o -
....
movl var, %eax
popq %rbp
...

With -fno-pic, this will never need to use -mpie-copy-relocations at all, so this case is not relevant right? Did I miss anything?

-fno-pic code can only be used with -no-pie links (position-dependent executables) If var is not defined in the linked executable, it will have a copy relocation.

Thanks for explaining. I know that by default (i.e. no-pic and no-pie), copy relocations will be used for external data accesses. So, you are saying that you are adding a mechanism to disable copy relocations for the no-pic/no-pie case too. Is there a need for this, purely a question. I know that copy relocations are frowned upon so maybe there was a feature request. If so, citing that would make it more clear. Thanks.

MaskRay added a comment.EditedDec 4 2020, 10:10 AM

You said : "The name -mpie-copy-relocations is misleading [1] and does not capture the idea that this option can actually apply to all of -fno-pic,-fpie, ..."

Could you please clarify why this option needs to apply to -fno-pic? Here is what I tried with trunk clang:

If the user wants to guarantee no copy relocations in -fno-pic code, they can theoretically apply -fno-direct-access-external-data to use a GOT indirection.
This is not implemented, though.

extern int var;
int get() { return var; }

$ clang -S foo.c -o -
....
movl var, %eax
popq %rbp
...

With -fno-pic, this will never need to use -mpie-copy-relocations at all, so this case is not relevant right? Did I miss anything?

-fno-pic code can only be used with -no-pie links (position-dependent executables) If var is not defined in the linked executable, it will have a copy relocation.

Thanks for explaining. I know that by default (i.e. no-pic and no-pie), copy relocations will be used for external data accesses. So, you are saying that you are adding a mechanism to disable copy relocations for the no-pic/no-pie case too. Is there a need for this, purely a question. I know that copy relocations are frowned upon so maybe there was a feature request. If so, citing that would make it more clear. Thanks.

I don't know of a feature request.

Is there a need for this, purely a question.

A weak need: more to make the behavior more align with the option name and perhaps align with potential GCC behavior (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 if they decide to implement this).
It can also be argued that this can be used to remove a special case for ppc64 ELFv2, which essentially defaults to -fno-direct-access-external-data for -fno-pic (they prefer to not have copy relocations)

Supporting -fno-direct-access-external-data for -fno-pic is not difficult: a one/two line change to TargetMachine::shouldAssumeDSOLocal refining if (!(GV && GV->isThreadLocal()) && RM == Reloc::Static).

LGTM. I checked it completely and the patch makes total sense to me.

clang/include/clang/Driver/Options.td
1866

Would -fdirect-access-external-global be a more appropriate name?, data sounds generic to me but I don't have a strong opinion.

LGTM. I checked it completely and the patch makes total sense to me.

Thanks! The name bothers me a bit. There are two parts "direct-access" (direct access relocations like absolute relocations and PC-relative relocations) "external-data" (this is for an external global data symbol with default visibility) (-fno-plt is a similar option for function symbols)

"external global data symbol with default visibility" is apparently too long and not suitable for an option name.
I do not know how to fit default into the name, so I just omit it.
I think external implies global (undefined local symbols are not allowed, either by the assembler or by the linker in various binary formats), so external-global probably does not convey more information than external-data. In the end I still think -fdirect-access-external-data is the best name I can think of...

Sorry just one more thing which is a bit concerning:

When I do :

$ clang -fPIC -frxternal-data-access foo.c

You will omit the GOT but this object can go into a shared object and break the build as this does not apply to shared objects? Should we allow this at all for -fPIC? I dont see a need for this but if you want to, maybe warn? The user should know that this cannot go into a shared object right?
I am also ok with commenting that this option can do bad things for -fPIC and the user should know what they are doing.

LGTM. I checked it completely and the patch makes total sense to me.

Thanks! The name bothers me a bit. There are two parts "direct-access" (direct access relocations like absolute relocations and PC-relative relocations) "external-data" (this is for an external global data symbol with default visibility) (-fno-plt is a similar option for function symbols)

"external global data symbol with default visibility" is apparently too long and not suitable for an option name.
I do not know how to fit default into the name, so I just omit it.
I think external implies global (undefined local symbols are not allowed, either by the assembler or by the linker in various binary formats), so external-global probably does not convey more information than external-data. In the end I still think -fdirect-access-external-data is the best name I can think of...

Acknowledged. To me, global is more specific but I do not want to push this any further. Feel free to keep the current choice :)

Sorry just one more thing which is a bit concerning:

When I do :

$ clang -fPIC -frxternal-data-access foo.c

You will omit the GOT but this object can go into a shared object and break the build as this does not apply to shared objects? Should we allow this at all for -fPIC? I dont see a need for this but if you want to, maybe warn? The user should know that this cannot go into a shared object right?
I am also ok with commenting that this option can do bad things for -fPIC and the user should know what they are doing.

clang -fPIC -fdirect-access-eternal-data -c a.c; ld -no-pie a.o; ld -pie a.o (producing an executable with either -no-pie or -pie) is fine.

For -shared, because an ELF linker assumes interposition for non-local STV_DEFAULT symbols by default, linking such an object files requires some mechanism to make it non-preemptible.
The simplest is clang -fPIC -fdirect-access-external-data -c a.c; ld -shared -Bsymbolic a.o
Other mechanisms include: --dynamic-list (don't list the symbol) and --version-script (localize the symbol).
This is going to be tricky and I don't know how to fit the information into the few-line description of the option.

I just checked how to make -fdirect-access-eternal-data work for -fno-pic (both TargetMachine::shouldAssumeDSOLocal and CodeGenModule.cpp:shouldAssumeDSOLocal should not assume false for Reloc::Static), unfortunately there are 200 tests needing updates. (This reminds me that LLVM doesn't get the default for dso_local right, so many tests have dso_local in ELF/COFF but no dso_local in Mach-O. Unfortunately it is extremely difficult to fix it today (D76561))

Sorry just one more thing which is a bit concerning:

When I do :

$ clang -fPIC -frxternal-data-access foo.c

You will omit the GOT but this object can go into a shared object and break the build as this does not apply to shared objects? Should we allow this at all for -fPIC? I dont see a need for this but if you want to, maybe warn? The user should know that this cannot go into a shared object right?
I am also ok with commenting that this option can do bad things for -fPIC and the user should know what they are doing.

clang -fPIC -fdirect-access-eternal-data -c a.c; ld -no-pie a.o; ld -pie a.o (producing an executable with either -no-pie or -pie) is fine.

Yes, agreed. Putting this object into an executable (pie/no-pie) no matter how you compile it is always fine as long as the backend supports copy relocations. No issues there.

For -shared, because an ELF linker assumes interposition for non-local STV_DEFAULT symbols by default, linking such an object files requires some mechanism to make it non-preemptible.

Right, that was my point. Without -fPIC, we can be guaranteed that it won't go into a shared object and this case wouldn't arise.

The simplest is clang -fPIC -fdirect-access-external-data -c a.c; ld -shared -Bsymbolic a.o
Other mechanisms include: --dynamic-list (don't list the symbol) and --version-script (localize the symbol).
This is going to be tricky and I don't know how to fit the information into the few-line description of the option.

How about making this option applicable for -fpie/fPIE and the default case and *not allowing* it for -fPIC/-fpic? That seems the safest approach.

I just checked how to make -fdirect-access-eternal-data work for -fno-pic (both TargetMachine::shouldAssumeDSOLocal and CodeGenModule.cpp:shouldAssumeDSOLocal should not assume false for Reloc::Static), unfortunately there are 200 tests needing updates. (This reminds me that LLVM doesn't get the default for dso_local right, so many tests have dso_local in ELF/COFF but no dso_local in Mach-O. Unfortunately it is extremely difficult to fix it today (D76561))

Ok, I lost you here a bit. This should be fine for -fno-pic was my understanding.

Let me try to summarize the motivations of this patch:

  1. Originally, the default build (fno-pie/fno-pic), always used direct access for external globals resulting in copy relocations if it is truly external at link time. You want to change that to be able to not have copy relocations with -fno-direct-access-external-data, and you claim this would be useful for POWER, great!
  2. Originally, with -fPIE and -fpie, -mpie-copy-relocations allowed direct access to externals. This was always safe because the object was guaranteed to go into an executable. The new option preserves this behavior so there is no concern.
  3. Originally, there was no way to generate direct accesses to external global variables with -fPIC/-fpic. That behavior will change if you support -fdirect-access-external-data with -fPIC. That is my concern that this adds to the complexity and the user should know what they are doing.

Given 3), I am wondering if you really need this patch. I will leave it to you but please do consider the fact that opening up this option to -fPIC might be a problem.

Sorry just one more thing which is a bit concerning:

When I do :

$ clang -fPIC -frxternal-data-access foo.c

You will omit the GOT but this object can go into a shared object and break the build as this does not apply to shared objects? Should we allow this at all for -fPIC? I dont see a need for this but if you want to, maybe warn? The user should know that this cannot go into a shared object right?
I am also ok with commenting that this option can do bad things for -fPIC and the user should know what they are doing.

clang -fPIC -fdirect-access-eternal-data -c a.c; ld -no-pie a.o; ld -pie a.o (producing an executable with either -no-pie or -pie) is fine.

Yes, agreed. Putting this object into an executable (pie/no-pie) no matter how you compile it is always fine as long as the backend supports copy relocations. No issues there.

For -shared, because an ELF linker assumes interposition for non-local STV_DEFAULT symbols by default, linking such an object files requires some mechanism to make it non-preemptible.

Right, that was my point. Without -fPIC, we can be guaranteed that it won't go into a shared object and this case wouldn't arise.

The simplest is clang -fPIC -fdirect-access-external-data -c a.c; ld -shared -Bsymbolic a.o
Other mechanisms include: --dynamic-list (don't list the symbol) and --version-script (localize the symbol).
This is going to be tricky and I don't know how to fit the information into the few-line description of the option.

How about making this option applicable for -fpie/fPIE and the default case and *not allowing* it for -fPIC/-fpic? That seems the safest approach.

I just checked how to make -fdirect-access-eternal-data work for -fno-pic (both TargetMachine::shouldAssumeDSOLocal and CodeGenModule.cpp:shouldAssumeDSOLocal should not assume false for Reloc::Static), unfortunately there are 200 tests needing updates. (This reminds me that LLVM doesn't get the default for dso_local right, so many tests have dso_local in ELF/COFF but no dso_local in Mach-O. Unfortunately it is extremely difficult to fix it today (D76561))

Ok, I lost you here a bit. This should be fine for -fno-pic was my understanding.

Let me try to summarize the motivations of this patch:

  1. Originally, the default build (fno-pie/fno-pic), always used direct access for external globals resulting in copy relocations if it is truly external at link time. You want to change that to be able to not have copy relocations with -fno-direct-access-external-data, and you claim this would be useful for POWER, great!
  2. Originally, with -fPIE and -fpie, -mpie-copy-relocations allowed direct access to externals. This was always safe because the object was guaranteed to go into an executable. The new option preserves this behavior so there is no concern.

Yes for 1 and 2. This patch only makes the options work for -fpie (as the original -mpie-copy-relocations does).

1 will be a nice cleanup (to drop 2 lines from TargetMachine::shouldAssumeDSOLocal) but it may require some test updates and it looks like CodeGen/MachineCSE.cpp exposes a crashing bug that it cannot handle non-dso_local external constant in -relocation-model=static mode correctly...

  1. Originally, there was no way to generate direct accesses to external global variables with -fPIC/-fpic. That behavior will change if you support -fdirect-access-external-data with -fPIC. That is my concern that this adds to the complexity and the user should know what they are doing.

Given 3), I am wondering if you really need this patch. I will leave it to you but please do consider the fact that opening up this option to -fPIC might be a problem.

I agree. The behavior is not touched in this patch.
I think the existing -mpie-copy-relocations users should be aware that the option (-fdirect-access-external-data) should generally only be used with -fno-pic and -fpie.

Sorry just one more thing which is a bit concerning:

When I do :

$ clang -fPIC -frxternal-data-access foo.c

You will omit the GOT but this object can go into a shared object and break the build as this does not apply to shared objects? Should we allow this at all for -fPIC? I dont see a need for this but if you want to, maybe warn? The user should know that this cannot go into a shared object right?
I am also ok with commenting that this option can do bad things for -fPIC and the user should know what they are doing.

clang -fPIC -fdirect-access-eternal-data -c a.c; ld -no-pie a.o; ld -pie a.o (producing an executable with either -no-pie or -pie) is fine.

Yes, agreed. Putting this object into an executable (pie/no-pie) no matter how you compile it is always fine as long as the backend supports copy relocations. No issues there.

For -shared, because an ELF linker assumes interposition for non-local STV_DEFAULT symbols by default, linking such an object files requires some mechanism to make it non-preemptible.

Right, that was my point. Without -fPIC, we can be guaranteed that it won't go into a shared object and this case wouldn't arise.

The simplest is clang -fPIC -fdirect-access-external-data -c a.c; ld -shared -Bsymbolic a.o
Other mechanisms include: --dynamic-list (don't list the symbol) and --version-script (localize the symbol).
This is going to be tricky and I don't know how to fit the information into the few-line description of the option.

How about making this option applicable for -fpie/fPIE and the default case and *not allowing* it for -fPIC/-fpic? That seems the safest approach.

I just checked how to make -fdirect-access-eternal-data work for -fno-pic (both TargetMachine::shouldAssumeDSOLocal and CodeGenModule.cpp:shouldAssumeDSOLocal should not assume false for Reloc::Static), unfortunately there are 200 tests needing updates. (This reminds me that LLVM doesn't get the default for dso_local right, so many tests have dso_local in ELF/COFF but no dso_local in Mach-O. Unfortunately it is extremely difficult to fix it today (D76561))

Ok, I lost you here a bit. This should be fine for -fno-pic was my understanding.

Let me try to summarize the motivations of this patch:

  1. Originally, the default build (fno-pie/fno-pic), always used direct access for external globals resulting in copy relocations if it is truly external at link time. You want to change that to be able to not have copy relocations with -fno-direct-access-external-data, and you claim this would be useful for POWER, great!
  2. Originally, with -fPIE and -fpie, -mpie-copy-relocations allowed direct access to externals. This was always safe because the object was guaranteed to go into an executable. The new option preserves this behavior so there is no concern.

Yes for 1 and 2. This patch only makes the options work for -fpie (as the original -mpie-copy-relocations does).

1 will be a nice cleanup (to drop 2 lines from TargetMachine::shouldAssumeDSOLocal) but it may require some test updates and it looks like CodeGen/MachineCSE.cpp exposes a crashing bug that it cannot handle non-dso_local external constant in -relocation-model=static mode correctly...

  1. Originally, there was no way to generate direct accesses to external global variables with -fPIC/-fpic. That behavior will change if you support -fdirect-access-external-data with -fPIC. That is my concern that this adds to the complexity and the user should know what they are doing.

Given 3), I am wondering if you really need this patch. I will leave it to you but please do consider the fact that opening up this option to -fPIC might be a problem.

I agree. The behavior is not touched in this patch.

Correct me if I am wrong, but I do see that this behavior is touched. Line 10 in -fdirect-access-external-data.c :

// RUN: %clang -### -c -target aarch64 %s -fpic -fdirect-access-external-data 2>&1 | FileCheck %s --check-prefix=DIRECT

With -fpic, the variable access will go directly and not via GOT.

I think the existing -mpie-copy-relocations users should be aware that the option (-fdirect-access-external-data) should generally only be used with -fno-pic and -fpie.

Correct me if I am wrong, but I do see that this behavior is touched. Line 10 in -fdirect-access-external-data.c :

// RUN: %clang -### -c -target aarch64 %s -fpic -fdirect-access-external-data 2>&1 | FileCheck %s --check-prefix=DIRECT

With -fpic, the variable access will go directly and not via GOT.

clang/test/Driver/fdirect-access-external-data.c is a driver test which tests how the driver passes options to CC1.

clang/lib/CodeGen/CodeGenModule.cpp says how the CC1 option affects the dso_local specifier in the LLVM IR output. Currently it is a no op for -fpic/-fPIC.

(I have made some tests. It looks like implementing -fno-direct-access-external-data for -fno-pic is not an insurmountable work: ~50 tests)

Correct me if I am wrong, but I do see that this behavior is touched. Line 10 in -fdirect-access-external-data.c :

// RUN: %clang -### -c -target aarch64 %s -fpic -fdirect-access-external-data 2>&1 | FileCheck %s --check-prefix=DIRECT

With -fpic, the variable access will go directly and not via GOT.

clang/test/Driver/fdirect-access-external-data.c is a driver test which tests how the driver passes options to CC1.

clang/lib/CodeGen/CodeGenModule.cpp says how the CC1 option affects the dso_local specifier in the LLVM IR output. Currently it is a no op for -fpic/-fPIC.

Great! Sorry I didnt realize this, this is fine with me now!

(I have made some tests. It looks like implementing -fno-direct-access-external-data for -fno-pic is not an insurmountable work: ~50 tests)

Correct me if I am wrong, but I do see that this behavior is touched. Line 10 in -fdirect-access-external-data.c :

// RUN: %clang -### -c -target aarch64 %s -fpic -fdirect-access-external-data 2>&1 | FileCheck %s --check-prefix=DIRECT

With -fpic, the variable access will go directly and not via GOT.

clang/test/Driver/fdirect-access-external-data.c is a driver test which tests how the driver passes options to CC1.

clang/lib/CodeGen/CodeGenModule.cpp says how the CC1 option affects the dso_local specifier in the LLVM IR output. Currently it is a no op for -fpic/-fPIC.

Great! Sorry I didnt realize this, this is fine with me now!

(I have made some tests. It looks like implementing -fno-direct-access-external-data for -fno-pic is not an insurmountable work: ~50 tests)

May I get a formal LGTM? 😋 I am happy to give GCC devs a few more days... https://gcc.gnu.org/pipermail/gcc/2021-January/234639.html (the previous feature request and gcc/ mailing list message has given them one month and -f[no-]direct-access-external-data is still the best name so far)

tmsriram accepted this revision.Jan 7 2021, 6:16 PM

Correct me if I am wrong, but I do see that this behavior is touched. Line 10 in -fdirect-access-external-data.c :

// RUN: %clang -### -c -target aarch64 %s -fpic -fdirect-access-external-data 2>&1 | FileCheck %s --check-prefix=DIRECT

With -fpic, the variable access will go directly and not via GOT.

clang/test/Driver/fdirect-access-external-data.c is a driver test which tests how the driver passes options to CC1.

clang/lib/CodeGen/CodeGenModule.cpp says how the CC1 option affects the dso_local specifier in the LLVM IR output. Currently it is a no op for -fpic/-fPIC.

Great! Sorry I didnt realize this, this is fine with me now!

(I have made some tests. It looks like implementing -fno-direct-access-external-data for -fno-pic is not an insurmountable work: ~50 tests)

May I get a formal LGTM? 😋 I am happy to give GCC devs a few more days... https://gcc.gnu.org/pipermail/gcc/2021-January/234639.html (the previous feature request and gcc/ mailing list message has given them one month and -f[no-]direct-access-external-data is still the best name so far)

Ok, I can approve this.

This revision is now accepted and ready to land.Jan 7 2021, 6:16 PM
MaskRay retitled this revision from Add -f[no-]direct-access-external-data to deprecate -mpie-copy-relocations to Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations.Jan 9 2021, 12:27 AM
MaskRay edited the summary of this revision. (Show Details)
This revision was landed with ongoing or failed builds.Jan 9 2021, 12:32 AM
This revision was automatically updated to reflect the committed changes.