This is an archive of the discontinued LLVM Phabricator instance.

Pass subtarget feature "force-align-stack"
ClosedPublic

Authored by ahatanak on Aug 6 2015, 12:06 PM.

Details

Summary

This patch makes changes to pass subtarget feature "force-align-stack" instead of passing a backend-option when users provide "-mstackrealign" on the command line.

The llvm-side change is here: http://reviews.llvm.org/D11814

Diff Detail

Repository
rL LLVM

Event Timeline

ahatanak updated this revision to Diff 31466.Aug 6 2015, 12:06 PM
ahatanak retitled this revision from to Pass subtarget feature "force-align-stack".
ahatanak updated this object.
ahatanak added reviewers: echristo, dexonsmith.
ahatanak added a subscriber: cfe-commits.
hfinkel requested changes to this revision.Aug 13 2015, 5:02 PM
hfinkel added a reviewer: hfinkel.
hfinkel added a subscriber: hfinkel.

As I've said in the review for D11814, this should be added to TargetOptions, and controlled accordingly.

This revision now requires changes to proceed.Aug 13 2015, 5:02 PM
echristo edited edge metadata.Aug 13 2015, 5:04 PM

Hi Hal,

No, TargetOptions is exactly the wrong place to handle this due to wanting to have various functions compiled with and without a force aligned stack at the IR level that might not hold up at LTO time.

-eric

Hi Hal,

No, TargetOptions is exactly the wrong place to handle this due to wanting to have various functions compiled with and without a force aligned stack at the IR level that might not hold up at LTO time.

No, that's not a problem. The code in lib/Target/TargetMachine.cpp (specifically via the RESET_OPTION macro), resets the target options based on the function attributes.

-eric

No, RESET_OPTION isn't the right way to do this either. For exactly this sort of reason. You can't actually represent all of the code and options this way in the IR. If you can't do that then it's a non-starter.

-eric

No, RESET_OPTION isn't the right way to do this either. For exactly this sort of reason. You can't actually represent all of the code and options this way in the IR. If you can't do that then it's a non-starter.

Can you please be more specific, what is it that we cannot represent? TargetOptions represents the target-generic code-generation options for the current function (where the "current function" bit works because the options get reset based on the current function attributes). That's the design we have now. And sticking with that pattern, even if we're going to change the overall scheme later, is better than the code duplication and inconsistency proposed here.

-eric

ahatanak updated this revision to Diff 32592.Aug 19 2015, 12:02 PM
ahatanak edited edge metadata.

This patch makes changes to record function attribute "force-align-stack" instead of recording it as a subtarget feature.

hfinkel added inline comments.Aug 27 2015, 4:45 PM
lib/Driver/Tools.cpp
4232 ↗(On Diff #32592)

The code below for OPT_mstackrealign uses -mstackrealign as the name of the backend option too. Why not do the same for OPT_mstackrealign (use -mstackrealign as the name of the backend option) instead of inventing a new flag name -force-align-stack?

echristo added inline comments.Aug 27 2015, 5:13 PM
lib/Driver/Tools.cpp
4232 ↗(On Diff #32592)

In general we don't do that. But I also don't want this to use a backend option anyhow, why are we doing that here once we have the attribute?

hfinkel added inline comments.Aug 27 2015, 5:20 PM
lib/Driver/Tools.cpp
4232 ↗(On Diff #32592)

It's not a backend option, this is the flag being passed from the driver to clang -cc1.

echristo added inline comments.Aug 27 2015, 5:23 PM
lib/Driver/Tools.cpp
4232 ↗(On Diff #32592)

Aha. I must have misread. Then I totally agree :)

ahatanak added inline comments.Aug 27 2015, 5:27 PM
lib/Driver/Tools.cpp
4232 ↗(On Diff #32592)

I believe the confusing part here is that the CC1 option "-mstackrealign" in the code below indicates stack realignment is allowed, but not necessarily forced (see the definition of StackRealignment in CodeGenOptions.def). This is different from the driver option options::OPT_mstackrealign, which indicates stack alignment should be forced.

Does that answer your question?

hfinkel added inline comments.Aug 27 2015, 5:55 PM
lib/Driver/Tools.cpp
4232 ↗(On Diff #32592)

Yes, but this somehow makes things seem even more broken. As I understand it, we have two underlying CodeGen options here:

  1. May the backend realign the stack if it thinks that it should? [Mainly because it has some overaligned local variable to put on the stack]. This is on by default.
  2. Must the backend realign all functions. This is off by default.

GCC has an option -mstackrealign documented to mean only (2). But we currently use its inverse (-mno-stackrealign) to mean the inverse of (1). Frankly, (1) seems like a debugging option, and I don't see why we are exposing it to users. If we have overaligned locals, than the backend should realign the stack. Always. Otherwise the code is broken. Then we can use -mstackrealign for its intended purpose of only meaning (2), and -mno-stackrealign the inverse of (2).

ahatanak added inline comments.Aug 28 2015, 11:37 AM
lib/Driver/Tools.cpp
4232 ↗(On Diff #32592)

I think you are right. gcc's -mno-stackrealign doesn't disallow stack realignment. It cancels a -mstackrealign preceding it, but has no effect if it appears alone on the command line.

I'll upload a new patch shortly.

ahatanak updated this revision to Diff 33456.Aug 28 2015, 1:10 PM
ahatanak edited edge metadata.

This updated patch changes the handling of driver option -mstackrealign/-mno-stackrealign. -mno-stackrealign no longer indicates stack realignment should be disallowed and clang no longer attaches function attribute "no-realign-stack". The option is only used to cancel -mstackrealign on the command line. The cc1 option "-mstackrealign" now means "force stack realignment" rather than "allow stack realignment" and causes function attribute "stackrealign" to be attached to the functions in the IR.

The cc1 option "-mstackrealign" now means "force stack realignment" rather than "allow stack realignment" and causes function attribute "stackrealign" to be attached to the functions in the IR.

Please, correct me if I'm wrong but I believe that the -force-align-stack option. which was removed in D11814, was x86-specific (in the sense that it was only really tested in X86) and almost always accompanied by a -stack-alignment value that was equal or greater than the requirements of the ABI.

With this change the -mstackrealign option will attach the "stackrealign" attribute on every function definition (and declaration) without specifying a valid -stack-alignment value.

I suspect that this option will be broken for every Target that relies only on the maximum alignment provided by MachineFrameInfo (see X86FrameLowering::calculateMaxStackAlign() for the correct way to handle this).

Is this the intended behaviour here? I'm a little bit cautious because this option would be exposed from the Clang frontend and our users would generate bad code if they were to try this option.

For example, on a Mips target, where the O32 ABI requires either way an 8-byte alignment, we would generate redundant code for realigning the stack to a 4-byte alignment if a function contains objects with maximum alignment of 4-bytes (see attached files to get an idea).

The cc1 option "-mstackrealign" now means "force stack realignment" rather than "allow stack realignment" and causes function attribute "stackrealign" to be attached to the functions in the IR.

Please, correct me if I'm wrong but I believe that the -force-align-stack option. which was removed in D11814, was x86-specific (in the sense that it was only really tested in X86) and almost always accompanied by a -stack-alignment value that was equal or greater than the requirements of the ABI.

With this change the -mstackrealign option will attach the "stackrealign" attribute on every function definition (and declaration) without specifying a valid -stack-alignment value.

-force-align-stack used to be an x86 option but has been a target independent option since r242727 (see the discussion in http://reviews.llvm.org/D11160). The purpose of this patch is to make sure -mstackrealign works when doing LTO. It is not intended to change the way -mstackrealign is handled by clang and llvm (except that -mno-stackrealign has a different meaning now).

I suspect that this option will be broken for every Target that relies only on the maximum alignment provided by MachineFrameInfo (see X86FrameLowering::calculateMaxStackAlign() for the correct way to handle this).

Is this the intended behaviour here? I'm a little bit cautious because this option would be exposed from the Clang frontend and our users would generate bad code if they were to try this option.

I'm not sure if this is the intended behavior, but it looks like it deviates from gcc's -mstackrealign.

For example, on a Mips target, where the O32 ABI requires either way an 8-byte alignment, we would generate redundant code for realigning the stack to a 4-byte alignment if a function contains objects with maximum alignment of 4-bytes (see attached files to get an idea).

According to the link below, originally -mstackrealign was used to keep the stack aligned to 16-bytes when there was a possibility that the calling function's stack was only 4-byte aligned. For mips, I don't know what the right thing to do is, but its seems pointless to align the stack to 4-bytes when the default alignment is 8-byte.

https://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/i386-and-x86_002d64-Options.html

For example, on a Mips target, where the O32 ABI requires either way an 8-byte alignment, we would generate redundant code for realigning the stack to a 4-byte alignment if a function contains objects with maximum alignment of 4-bytes (see attached files to get an idea).

I wonder if there is a target or a use case that requires or prefers realigning the stack to an alignment that is smaller than the default stack alignment. If there is no such target or use case, I think we can just using the existing attribute StackAlignment (with value 0) rather than adding a new function attribute "stackrealign", which will ensure the stack is at least aligned to the default value and force realigning the stack.

For example, on a Mips target, where the O32 ABI requires either way an 8-byte alignment, we would generate redundant code for realigning the stack to a 4-byte alignment if a function contains objects with maximum alignment of 4-bytes (see attached files to get an idea).

I wonder if there is a target or a use case that requires or prefers realigning the stack to an alignment that is smaller than the default stack alignment. If there is no such target or use case, I think we can just using the existing attribute StackAlignment (with value 0) rather than adding a new function attribute "stackrealign", which will ensure the stack is at least aligned to the default value and force realigning the stack.

I was thinking something similar. I'm not quite sure but the only case I can think of, where we might get away with an alignment smaller than the default, is when we emit code for the fast calling convention. Other than that, I believe that we should duplicate the functionality of X86FrameLowering::calculateMaxStackAlign() (or override it in each target accordingly).

The purpose of this patch is to make sure -mstackrealign works when doing LTO

Out of curiosity, I was wondering why dynamic stack realignment isn't enough for LTO. I would guess that the alignment of the data types used under SSE might have a smaller alignment than the one required by the ABI.

The purpose of this patch is to make sure -mstackrealign works when doing LTO

Out of curiosity, I was wondering why dynamic stack realignment isn't enough for LTO. I would guess that the alignment of the data types used under SSE might have a smaller alignment than the one required by the ABI.

I'm not sure if I understood your question, but if you are asking why -mstackrealign doesn't work when doing LTO, the reason is much simpler. In order to force stack realignment in the backend, ForceStackAlign has to be set to true, but that doesn't happen because -force-align-stack isn't passed to libLTO (or passed as a plugin option if gold is being used).

Hal, do you have any thoughts on the points Vasileios brought up? Currently, many of the targets don't guarantee that the realigned stack is at least as aligned as the default alignment required by the ABI. Is this the behavior end-users expect when they use -mstackrealign on the command line?Fixing this is beyond the initial scope of this patch and probably should be done in a separate patch, but I want to make sure the patch I'll end up committing won't make it harder to fix this problem.

Hal, do you have any thoughts on the points Vasileios brought up? Currently, many of the targets don't guarantee that the realigned stack is at least as aligned as the default alignment required by the ABI. Is this the behavior end-users expect when they use -mstackrealign on the command line?

I don't think this is expected behavior, and sounds like a bug.

Fixing this is beyond the initial scope of this patch and probably should be done in a separate patch, but I want to make sure the patch I'll end up committing won't make it harder to fix this problem.

Agreed. I don't see how this makes it harder.

hfinkel accepted this revision.Sep 9 2015, 11:02 PM
hfinkel edited edge metadata.

Hal, do you have any thoughts on the points Vasileios brought up? Currently, many of the targets don't guarantee that the realigned stack is at least as aligned as the default alignment required by the ABI. Is this the behavior end-users expect when they use -mstackrealign on the command line?

I don't think this is expected behavior, and sounds like a bug.

To be more specific, my understanding of the use case (which is the use case for this I've come across myself), is that you need to compile code for a system that does not actually provide the stack alignment that LLVM believes should be guaranteed by the current target ABI. This can happen if the alignment guarantee has changed in the past, and you need to run on the older systems.

Also, on PowerPC, for example, the current LLVM implementation might relaign to using a lower-than-target-ABI alignment when we force realignment, but this should just be a noop.

In any case, this now LGTM.

Fixing this is beyond the initial scope of this patch and probably should be done in a separate patch, but I want to make sure the patch I'll end up committing won't make it harder to fix this problem.

Agreed. I don't see how this makes it harder.

This revision is now accepted and ready to land.Sep 9 2015, 11:02 PM

OK, thanks. I'll go ahead and commit this patch and the llvm-side patch.

This revision was automatically updated to reflect the committed changes.