[ARM] ScheduleDAGRRList::DelayForLiveRegsBottomUp must consider OptionalDefs
AcceptedPublic

Authored by tyomitch on Fri, Mar 17, 7:52 AM.

Details

Summary

D30400 has enabled tADC and tSBC instructions to be unglued, thereby allowing CPSR to remain live between Thumb1 scheduling units.

Most Thumb1 instructions have an OptionalDef for CPSR; but the scheduler ignored the OptionalDefs, and could unwittingly insert a flag-setting instruction in between an ADDS and the corresponding ADC.

tyomitch created this revision.Fri, Mar 17, 7:52 AM

Hi :
Please add a commit message explaining the problem you are trying to solve e.g.
"when checking for registers that are live to avoid scheduling - optional definitions such as ARM instructions which can set condition code if 's' bit is set - are mis-diagnosed"
It may be better to write the test as an mir-test, as it would illustrate the problem more clearly.
Thanks.

tyomitch edited the summary of this revision. (Show Details)Fri, Mar 17, 10:47 AM

I didn't realize the Thumb1 instructions weren't marked up correctly. :( Please revert D30400 until this is fixed.

This isn't really the right fix; each Thumb1 instruction should either have CPSR as an explicit Def, or list it in ImplicitDefs. Adding the additional check for optional defs might solve your problem here, sort of, but we check for ImplicitDefs in a lot of places, so it's better to get it right.

Eli, do you mean that OptionalDefs should never be used because the scheduler (and perhaps other places) ignore them?

ARM instructions (incl. Thumb1 and Thumb2) have always had OptionalDef operands for CPSR.

Unfortunately, I'm not that familiar with ScheduleDAG, so I'm not sure I can give good guidance. I see two things very wrong with this patch, though. One, it prevents schedules which are perfectly legal for non-Thumb1 code. Two, you're specifically updating one of the schedulers, but they're all affected by this issue.

efriedma set the repository for this revision to rL LLVM.

One, it prevents schedules which are perfectly legal for non-Thumb1 code.

No, it doesn't: for the test case that I'm adding, the Thumb2 output remains a sequence of (adds, eor, eor, adc).
This is because the code I'm adding checks not only the presence but also the actual value of the OptionalDef operand, and its value is %noreg when the instruction doesn't set CPSR.

Oh, okay, now I follow what you're doing, mostly.

I've never seen OptionalDefs before (and the ARM target seems to be the only one using them). Does this mean optional defs allow you to sometimes have an extra definition that otherwise looks like a use?!? Wouldn't we need to adjust ScheduleDAGSDNodes so it reflects the physreg definition in the schedule dag?

OptionalDefOperand is only used by the ARM backend, yes. It's used to avoid creating a bunch of variants of instructions which modify the flag register, CPSR, since almost every arithmetic instruction has an "s" variant. It's a def of CPSR if the S bit is set, otherwise it's a use of %noreg, or something like that, I think. This is complicated by the fact that in Thumb1, the non-S variants don't actually exist, which is why we have this particular problem.

Granted, it would be simpler overall to just multiclass the instructions in question, but it would a big project to change that.

tyomitch added a comment.EditedSat, Mar 18, 2:17 AM

It's a def of CPSR if the S bit is set, otherwise it's a use of %noreg, or something like that, I think.

Yes, this is exactly right.

This is complicated by the fact that in Thumb1, the non-S variants don't actually exist, which is why we have this particular problem.

It's slightly more complicated than this: in Thumb1, the instruction is either predicated or has the S bit, so the backend needs to be able to model both cases.

(D30400 is now reverted... but we still want a fix here.)

Shouldn't we commit this now to fix ScheduleDAGRRList, and if/when similar problems are discovered with other schedulers that are used for ARM targets, to fix them then?

Oh... looking a bit more closely, the "fast" scheduler is basically dead, and the VLIW scheduler is obviously irrelevant here. And I guess this is pretty low risk, given that ARM is the only backend using optional defs.

This needs a much better comment explaining why this is necessary (covering the important points discussed in this review). Otherwise, I'm okay with this (but I'm really not familiar with this code, so I'd like someone else to confirm that judgement).


I spent a little time experimenting with multiclassing the Thumb1 instructions to avoid optional defs; it seems like it's viable, and it's substantially simpler in some ways, but it would take a lot of time to finish.

tyomitch updated this revision to Diff 92640.Wed, Mar 22, 8:15 AM

Added a comment explaining why this is necessary

This needs a much better comment explaining why this is necessary

Done.

I spent a little time experimenting with multiclassing the Thumb1 instructions to avoid optional defs;
it seems like it's viable, and it's substantially simpler in some ways, but it would take a lot of time to finish.

Yes, that's my own understanding, too.

javed.absar accepted this revision.Thu, Mar 23, 2:33 AM

Based on above discussions, LGTM.

This revision is now accepted and ready to land.Thu, Mar 23, 2:33 AM

Hi Tim,

James has suggested that you take a look, too.

LGTM from my side. But please wait for someone else to approve it too.