This is an archive of the discontinued LLVM Phabricator instance.

[SPARC] Fix register reuse in leaf function remapping step
AbandonedPublic

Authored by LemonBoy on May 12 2021, 11:05 AM.

Details

Summary

Leaf functions allow the compiler to omit the setup and teardown of a frame pointer, therefore avoiding the exchange of the in/out register. According to the SPARC architecture manual every reference to %i0-%i5 should be replaced with %o0-o5, if the target register is already in use a further remapping step to %g1-%g7 is required to free the output register.

Add a simple check to make sure not to stomp on any output register that's already in use.

Diff Detail

Event Timeline

LemonBoy created this revision.May 12 2021, 11:05 AM
LemonBoy requested review of this revision.May 12 2021, 11:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 12 2021, 11:05 AM

Ping. Anything else that is need to be done for this?

Hello!

It seems that this patch contains a rewrite of the remapRegsForLeafProc() function in addition to the new check described in the summary. Could you split it up into two patches, one with the rewrite and one with the added check?

Is the problem that the isLeafProc() does not detect the case when inline assembly clobbers one of the output registers or have you been able to trigger it in some other way? Does it make sense to put the extra check in isLeafProc() instead?

llvm/lib/Target/Sparc/SparcFrameLowering.cpp
394–396

Remove these tab characters.

Hello!

It seems that this patch contains a rewrite of the remapRegsForLeafProc() function in addition to the new check described in the summary. Could you split it up into two patches, one with the rewrite and one with the added check?

I'm not sure about this. The way I see it is that the checks need to be done together with the rewrite, so it probably cannot be split like that (?)

Is the problem that the isLeafProc() does not detect the case when inline assembly clobbers one of the output registers

No, the problem is that on functions with more than six parameters, when leaf function optimization is turned on, the function might be compiled in such a way that the loading of 7th-and-further parameters corrupts the value of other parameters.
This does not happen on functions with six or less parameters, even with inline assembly inside.

For example, on current LLVM, the testcase in the patch compiles to this:

leaf_proc_give_up:
    ld [%sp+2227], %o5 # The original value of %o5 is lost here
                       # causing the inline assembly to run with
                       # the wrong register values.
    mov     %o0, %g1
    mov     %o1, %o0
    mov     %o2, %o1
    mov     %o3, %o2
    mov     %o4, %o3
    mov     %o5, %o4
    !APP
    !NO_APP
    retl
    nop

As far as I can tell, there are three solutions for this: (1) reorder the code so the loads happen after the movs, (2) load the parameter to a temporary register first then move it to %o5 after all the other movs are done, or (3) simply bail out of the optimization entirely, as is done in this patch.

or have you been able to trigger it in some other way?

So far the only way I know to trigger this is with inline assembly (we encountered this when we're trying to write a system call wrapper for Linux).

Does it make sense to put the extra check in isLeafProc() instead?

I guess it is possible to do that by flagging all functions that has more than six parametes as nonleaf (so the optimization wouldn't be triggered), but I'm afraid that it'd be too aggressive.

Hello!

It seems that this patch contains a rewrite of the remapRegsForLeafProc() function in addition to the new check described in the summary. Could you split it up into two patches, one with the rewrite and one with the added check?

I'm not sure about this. The way I see it is that the checks need to be done together with the rewrite, so it probably cannot be split like that (?)

The only functional change I can see is the check at line 345-348, and that should be in SparcFrameLowering::isLeafProc(). Then you do not need to rewrite determineCalleeSaves(). Or is there some functional change that I'm missing?

isLeafProc() checks if %L0 is used to determine if any of the %L or %O registers are used since the %I registers are allocated first. But it breaks if inline assembly code uses any of %L or %O. I ran into this bug before and fixed it by adding some code in tryInlineAsm() to mark the function as ineligible for leaf optimization if any of the %L or %O registers were used. So that is an alternative approach. But this approach is fine be me, but the check should be in isLeafProc().

But this approach is fine be me, but the check should be in isLeafProc().

The section H.12 of the SPARC Architecture manual outlines the whole register remapping process that requires the re-allocation of %oN to free %gN registers if needed, I didn't implement the whole thing because of time/interest constraints but as it's written it should be relatively easy to add the missing steps.

Ping.

The section H.12 of the SPARC Architecture manual outlines the whole register remapping process that requires the re-allocation of %oN to free %gN registers if needed, I didn't implement the whole thing because of time/interest constraints but as it's written it should be relatively easy to add the missing steps.

I tried to implement this, but as far as I understand it, it's impossible (or at least very hard) to do when it's already this late because even if we manage to put the load result in a %gN register, there's no way to emit an extra mov to get the value into %o5.
Also, as an additional note, functions that does not contain inline assembly generates the correct code on unpatched LLVM, so I believe the existing register remapping is already mostly working as intended.

I propose keeping the remapRegsForLeafProc() as is and add a check to isLeafProc() so that it refuses to optimize a function if it contains inline assembly instead. To me, it seems that it's a less invasive option, what do you think?

Ping? Anything else needs to be done for this?

Herald added a project: Restricted Project. · View Herald TranscriptMar 21 2022, 5:30 AM
brad added a comment.Jun 18 2022, 5:24 PM

Ping? Anything else needs to be done for this?

This needs to be rebased. Since you had a proposal for additional changes did you want to post your own diff?

LemonBoy abandoned this revision.Jun 19 2022, 1:38 AM