This is an archive of the discontinued LLVM Phabricator instance.

[ClangFE] Add support for option -mno-pic-data-is-text-relative
ClosedPublic

Authored by jonpa on Oct 30 2022, 12:11 PM.

Details

Summary

(Patch in progress)

Add support for this GCC option which has the purpose of disallowing text relative accesses of module local symbols with PIC.

I am fumbling a bit with this and would like to ask for some help as I have no good idea of how this is best done. My idea so far has been to

  • patch adjustGVALinkageForAttributes() to change the Linkage to External for these GlobalValues.
  • Add Driver option but I could not make this work all the way into CompilerInvocation.cpp. The round-trip check of the arguments is failing... :-/

Diff Detail

Event Timeline

jonpa created this revision.Oct 30 2022, 12:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 30 2022, 12:11 PM
Herald added a subscriber: StephenFan. · View Herald Transcript
jonpa requested review of this revision.Oct 30 2022, 12:11 PM

Can you describe the motivation adding this option? Is it for a FDPIC ABI? Removing the assumption that text and data has a constant offset generally make the code worse for data access. This is almost assuredly a bad idea if the system has a memory management unit.

Can you describe the motivation adding this option? Is it for a FDPIC ABI? Removing the assumption that text and data has a constant offset generally make the code worse for data access. This is almost assuredly a bad idea if the system has a memory management unit.

This is intended to be used for "kpatch" - the Linux kernel live-patching facility. This allows replacing parts of the kernel code with an updated version at runtime. However, the *data* segment cannot be moved since it is already in use (and pointers to the data may be around), and so the replacement text segment will have to refer to the original data segment. Since the replacement text segment will generally end up at different addresses than the original one, this means the compiler cannot make assumptions about the relative distance from code to data.

Of course this in general results in worse code, but that is considered an acceptable cost to enable the live-patching feature. Note that GCC already supports the same option with the same semantics, used for the same purpose. The intent of supporting it in clang as well is simply to optionally enabling building the Linux kernel with clang instead of GCC while preserving the full set of kernel features.

Can you describe the motivation adding this option? Is it for a FDPIC ABI? Removing the assumption that text and data has a constant offset generally make the code worse for data access. This is almost assuredly a bad idea if the system has a memory management unit.

This is intended to be used for "kpatch" - the Linux kernel live-patching facility. This allows replacing parts of the kernel code with an updated version at runtime. However, the *data* segment cannot be moved since it is already in use (and pointers to the data may be around), and so the replacement text segment will have to refer to the original data segment. Since the replacement text segment will generally end up at different addresses than the original one, this means the compiler cannot make assumptions about the relative distance from code to data.

Of course this in general results in worse code, but that is considered an acceptable cost to enable the live-patching feature. Note that GCC already supports the same option with the same semantics, used for the same purpose. The intent of supporting it in clang as well is simply to optionally enabling building the Linux kernel with clang instead of GCC while preserving the full set of kernel features.

I know the existence of livepatch in the Linux kernel though unfamiliar with it. The kernel doesn't use -mno-pic-data-is-text-relative right now for any port. Why does SystemZ need -mno-pic-data-is-text-relative for livepatch?

For work-in-progress patches, there is a convention to use [WIP] or a similar tag. [ClangFE] has never been used for Clang patches.
If you want to ask how to implement this, you may raise a thread on discourse. I think quite a lot of places need to be changed to support -mno-pic-data-is-text-relative.

Can you describe the motivation adding this option? Is it for a FDPIC ABI? Removing the assumption that text and data has a constant offset generally make the code worse for data access. This is almost assuredly a bad idea if the system has a memory management unit.

This is intended to be used for "kpatch" - the Linux kernel live-patching facility. This allows replacing parts of the kernel code with an updated version at runtime. However, the *data* segment cannot be moved since it is already in use (and pointers to the data may be around), and so the replacement text segment will have to refer to the original data segment. Since the replacement text segment will generally end up at different addresses than the original one, this means the compiler cannot make assumptions about the relative distance from code to data.

Of course this in general results in worse code, but that is considered an acceptable cost to enable the live-patching feature. Note that GCC already supports the same option with the same semantics, used for the same purpose. The intent of supporting it in clang as well is simply to optionally enabling building the Linux kernel with clang instead of GCC while preserving the full set of kernel features.

I know the existence of livepatch in the Linux kernel though unfamiliar with it. The kernel doesn't use -mno-pic-data-is-text-relative right now for any port. Why does SystemZ need -mno-pic-data-is-text-relative for livepatch?

This option is not present in the kernel sources, but it is used by the kpatch tool when building patch modules:
https://github.com/dynup/kpatch/blob/master/kpatch-build/kpatch-build

Whether or not this is needed on any platform depends on the particulars of the ISA, e.g. what is the span reachable via PC-relative instructions.

I'm very confused how you expect this to work... messing with the linkage of global variables is almost certainly going to explode. If you need emit special sequences to access global variables with internal linkage, I'd expect the backend to handle that (along the lines of how -frwpi works on ARM).

I'm very confused how you expect this to work... messing with the linkage of global variables is almost certainly going to explode. If you need emit special sequences to access global variables with internal linkage, I'd expect the backend to handle that (along the lines of how -frwpi works on ARM).

I agree, this needs to be handled in the back end (it's done that way in the GCC implementation too). @jonpa can you have a look?

jonpa updated this revision to Diff 473507.Nov 6 2022, 12:01 PM

Sorry for the confusion... My first idea to make an internal value external was based on the assumption that since no other module would ever actually be aware of it, it should be safe to play around with it and pretend that its linkage is external. Live patching in itself is weird enough to perhaps allow this kind of trick. But I guess then that it wasn't really a good idea at all... (as a newbie here I'm curious on what would break..?)

I tried another approach along your suggestions instead:

  • Be compatible with GCC and accept -mpic-data-is-text-relative and -mno-pic-data-is-text-relative CL options.
  • If Clang Driver gets -mno-pic-data-is-text-relative, it emits -no-pic-data-is-text-relative (just like -pic-level and -pic-is-pie) to the CmdArgs to CC1.
  • If CC1 gets -no-pic-data-is-text-relative, a Module flag is emitted to indicate this (just like "PIC Level" and "PIE Level").
  • If this module attribute is present, the SystemZ backend will access an internal global value via GOT.
Herald added a project: Restricted Project. · View Herald TranscriptNov 6 2022, 12:01 PM
Herald added a subscriber: hiraditya. · View Herald Transcript

Makes sense to me. Maybe someone more familiar with the SystemZ side of things can confirm.

(as a newbie here I'm curious on what would break..?)

Off the top of my head:

  • Other code could create internal globals
  • Name conflict if multiple globals have the same name
  • C++ name mangling might get confused.
llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
3217 ↗(On Diff #473507)

Move this check into isPC32DBLSymbol, maybe?

mnadeem resigned from this revision.Nov 7 2022, 10:43 AM

Making -mpic-data-is-text-relative unrecognized by non-SystemZ targets LG.

clang/test/Driver/pic.c
322

Use --target= for new tests. -target is legacy.

llvm/test/CodeGen/SystemZ/pic-data-is-text-rel.ll
1 ↗(On Diff #473507)

You can use a trick like llvm/Instrumentation/InstrProfiling/comdat.ll to combine two files into one.

32 ↗(On Diff #473507)

delete trailing blank line

Move this check into isPC32DBLSymbol, maybe?

Hmm. This is actually a very interesting point. isPC32DBLSymbol already implements exactly the behavior we want for this option, if -mcmodel=medium or -mcmodel=large is given. I hadn't thought about that.

So maybe instead of implementing a new module flag, the -mno-pic-data-is-text-relative option can be implemented solely in the front end or driver by mapping it to -mcmodel=medium?

(Unfortunately GCC on s390x does not implement -mcmodel, otherwise we might not have even needed the new option ...)

jonpa updated this revision to Diff 474999.Nov 13 2022, 7:17 AM

Emit -mcmodel=medium instead from Driver.

jonpa updated this revision to Diff 475166.Nov 14 2022, 8:55 AM
jonpa marked an inline comment as done.

Check both versions of the option when emitting an error for unsupported option/target.

Off the top of my head:

Other code could create internal globals
Name conflict if multiple globals have the same name
C++ name mangling might get confused.

Ah, I see - that makes sense.

uweigand accepted this revision.Nov 17 2022, 4:40 AM

LGTM, thanks!

This revision is now accepted and ready to land.Nov 17 2022, 4:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 17 2022, 8:33 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

The test clang/test/Driver/pic.c failed when we compiled Clang/LLVM with -DCLANG_DEFAULT_PIE_ON_LINUX=False.

jonpa added a comment.Nov 22 2022, 6:05 AM

The test clang/test/Driver/pic.c failed when we compiled Clang/LLVM with -DCLANG_DEFAULT_PIE_ON_LINUX=False.

Ah... I suppose that must be because -mno-pic-data-is-text-relative requires PIC (this is per GCC behavior). Does it help if you add -fpic to the RUN lines that were added by this patch? I was not aware of the cmake flag, sorry.

The test clang/test/Driver/pic.c failed when we compiled Clang/LLVM with -DCLANG_DEFAULT_PIE_ON_LINUX=False.

Ah... I suppose that must be because -mno-pic-data-is-text-relative requires PIC (this is per GCC behavior). Does it help if you add -fpic to the RUN lines that were added by this patch? I was not aware of the cmake flag, sorry.

For me it does. Can we make this change?

jonpa added a comment.Nov 22 2022, 7:32 AM

The test clang/test/Driver/pic.c failed when we compiled Clang/LLVM with -DCLANG_DEFAULT_PIE_ON_LINUX=False.

Ah... I suppose that must be because -mno-pic-data-is-text-relative requires PIC (this is per GCC behavior). Does it help if you add -fpic to the RUN lines that were added by this patch? I was not aware of the cmake flag, sorry.

For me it does. Can we make this change?

Yes, that should be fine as the test is assuming pic by default. Thanks.

The test clang/test/Driver/pic.c failed when we compiled Clang/LLVM with -DCLANG_DEFAULT_PIE_ON_LINUX=False.

Ah... I suppose that must be because -mno-pic-data-is-text-relative requires PIC (this is per GCC behavior). Does it help if you add -fpic to the RUN lines that were added by this patch? I was not aware of the cmake flag, sorry.

For me it does. Can we make this change?

Yes, that should be fine as the test is assuming pic by default. Thanks.

I pushed the fix - https://github.com/llvm/llvm-project/commit/0ffcd243a42bff13966e455582d2db6c337628a2 . Thank you.