[ARM] ScheduleDAGRRList::DelayForLiveRegsBottomUp must consider OptionalDefs
ClosedPublic

Authored by tyomitch on Mar 17 2017, 7:52 AM.

Details

Summary

D30400 has enabled tADC and tSBC instructions to be unglued, thereby allowing CPSR to remain live between Thumb1 scheduling units.

Most Thumb1 instructions have an OptionalDef for CPSR; but the scheduler ignored the OptionalDefs, and could unwittingly insert a flag-setting instruction in between an ADDS and the corresponding ADC.

Diff Detail

Repository
rL LLVM
tyomitch created this revision.Mar 17 2017, 7:52 AM

Hi :
Please add a commit message explaining the problem you are trying to solve e.g.
"when checking for registers that are live to avoid scheduling - optional definitions such as ARM instructions which can set condition code if 's' bit is set - are mis-diagnosed"
It may be better to write the test as an mir-test, as it would illustrate the problem more clearly.
Thanks.

tyomitch edited the summary of this revision. (Show Details)Mar 17 2017, 10:47 AM

I didn't realize the Thumb1 instructions weren't marked up correctly. :( Please revert D30400 until this is fixed.

This isn't really the right fix; each Thumb1 instruction should either have CPSR as an explicit Def, or list it in ImplicitDefs. Adding the additional check for optional defs might solve your problem here, sort of, but we check for ImplicitDefs in a lot of places, so it's better to get it right.

Eli, do you mean that OptionalDefs should never be used because the scheduler (and perhaps other places) ignore them?

ARM instructions (incl. Thumb1 and Thumb2) have always had OptionalDef operands for CPSR.

Unfortunately, I'm not that familiar with ScheduleDAG, so I'm not sure I can give good guidance. I see two things very wrong with this patch, though. One, it prevents schedules which are perfectly legal for non-Thumb1 code. Two, you're specifically updating one of the schedulers, but they're all affected by this issue.

efriedma set the repository for this revision to rL LLVM.

One, it prevents schedules which are perfectly legal for non-Thumb1 code.

No, it doesn't: for the test case that I'm adding, the Thumb2 output remains a sequence of (adds, eor, eor, adc).
This is because the code I'm adding checks not only the presence but also the actual value of the OptionalDef operand, and its value is %noreg when the instruction doesn't set CPSR.

Oh, okay, now I follow what you're doing, mostly.

I've never seen OptionalDefs before (and the ARM target seems to be the only one using them). Does this mean optional defs allow you to sometimes have an extra definition that otherwise looks like a use?!? Wouldn't we need to adjust ScheduleDAGSDNodes so it reflects the physreg definition in the schedule dag?

OptionalDefOperand is only used by the ARM backend, yes. It's used to avoid creating a bunch of variants of instructions which modify the flag register, CPSR, since almost every arithmetic instruction has an "s" variant. It's a def of CPSR if the S bit is set, otherwise it's a use of %noreg, or something like that, I think. This is complicated by the fact that in Thumb1, the non-S variants don't actually exist, which is why we have this particular problem.

Granted, it would be simpler overall to just multiclass the instructions in question, but it would a big project to change that.

tyomitch added a comment.EditedMar 18 2017, 2:17 AM

It's a def of CPSR if the S bit is set, otherwise it's a use of %noreg, or something like that, I think.

Yes, this is exactly right.

This is complicated by the fact that in Thumb1, the non-S variants don't actually exist, which is why we have this particular problem.

It's slightly more complicated than this: in Thumb1, the instruction is either predicated or has the S bit, so the backend needs to be able to model both cases.

(D30400 is now reverted... but we still want a fix here.)

Shouldn't we commit this now to fix ScheduleDAGRRList, and if/when similar problems are discovered with other schedulers that are used for ARM targets, to fix them then?

Oh... looking a bit more closely, the "fast" scheduler is basically dead, and the VLIW scheduler is obviously irrelevant here. And I guess this is pretty low risk, given that ARM is the only backend using optional defs.

This needs a much better comment explaining why this is necessary (covering the important points discussed in this review). Otherwise, I'm okay with this (but I'm really not familiar with this code, so I'd like someone else to confirm that judgement).


I spent a little time experimenting with multiclassing the Thumb1 instructions to avoid optional defs; it seems like it's viable, and it's substantially simpler in some ways, but it would take a lot of time to finish.

tyomitch updated this revision to Diff 92640.Mar 22 2017, 8:15 AM

Added a comment explaining why this is necessary

This needs a much better comment explaining why this is necessary

Done.

I spent a little time experimenting with multiclassing the Thumb1 instructions to avoid optional defs;
it seems like it's viable, and it's substantially simpler in some ways, but it would take a lot of time to finish.

Yes, that's my own understanding, too.

javed.absar accepted this revision.Mar 23 2017, 2:33 AM

Based on above discussions, LGTM.

This revision is now accepted and ready to land.Mar 23 2017, 2:33 AM

Hi Tim,

James has suggested that you take a look, too.

LGTM from my side. But please wait for someone else to approve it too.

Renato, perhaps you could take a look at this patch?

Hi,

I've spent some time coming up to speed on this in picking it up from Artyom. The million dollar question on this seems to be whether OptionalDefs are a total hack and should be ripped out entirely, or should be kept. I think in the latter case, the patch itself looks correct.

Firstly, while OptionalDef is only used in the ARM backend it is used there pervasively. I saw Eli mention that he'd experimented with multiclassing Thumb1 instructions to get rid of the optionaldefs - but optionaldef isn't just used in Thumb-1, it's used in all ARM-mode instructions and most Thumb-2 instructions too.

I certainly agree that the design of AArch64 is far superior, with separate S and non-S variants of instructions, however I also think that moving the ARM backend to this model is a large project fraught with danger (so has to be well thought out to avoid the strong possibility of a mistranslation).

OptionalDefs also seem to be well-specified - they are simply a def that has a default if none is set. Therefore there should be no bad interactions with the register allocator as Matthias had worried (if there was, we'd have seen this before, it's used so pervasively!)

I'd therefore argue that this patch is correct, fixes a known defect, so should be committed.

Thoughts?

James

Ping? Tim, Renato?

I agree with James, refactoring OptionalDef is a task too big for this patch.

However, the patch itself is quite obscure (what lead here). The comment on the code is good, but the test looks like a random sequence of instructions, if you don't see the commit message (because the comment in the code doesn't explain it).

I'd just add the short commit message as a comment to the test, which will explain why we don't want to see an eors between the two adds.

Otherwise, LGTM.

Thanks!

This revision was automatically updated to reflect the committed changes.