This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Introduce a flag to allow same cycle def-use schedule
Needs ReviewPublic

Authored by ssarda on Oct 16 2022, 8:37 PM.

Details

Summary

Some targets may allow def-use scheduling within same cycle. Introduce a flag in Pipeliner to allow the same.

Diff Detail

Event Timeline

ssarda created this revision.Oct 16 2022, 8:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 16 2022, 8:37 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
ssarda requested review of this revision.Oct 16 2022, 8:37 PM

The test in isValidSchedule isn't there to prevent use/def in the same cycle due to assumed limitations in targets; in general, we do want to allow use/def in the same cycle if that's what the scheduling model has led us to. Instead, it's there to work around misoptimizations caused by the fact that while orderDependence is responsible for making sure that use/def are properly ordered within a single cycle, orderDependence doesn't attempt to do so when physical registers are involved.

So I worry that Hexagon working when the test is elided is simply a happy side-effect of its current scheduling models, and not due to a guarantee that def/use ordering cannot somehow get violated when modulo scheduling is performed for it. Of course, before I added the test, Hexagon was presumably not having problems, so maybe there is something special about it, in which case this change seems reasonable -- though perhaps with a commit message with a bit more explanation (e.g. that this only affects physical registers).

I do think a more robust solution, likely also improving scheduling for multiple targets, would be to get orderDependence to properly handle def/use ordering when there's a physical register, though I don't know how difficult that might be. I'm adding the (assumed) original author of orderDependence to the reviewer list to see if they have more insight.

I agree that orderDependence() need to be more robust. However, we can't penalize pipeliner scheduling where def-use are in same cycle. I may add a TODO as of now and keep the flag. Once orderDependence() is fixed, we may remove the flag. Does it sound good?

I agree that orderDependence() need to be more robust. However, we can't penalize pipeliner scheduling where def-use are in same cycle. I may add a TODO as of now and keep the flag. Once orderDependence() is fixed, we may remove the flag. Does it sound good?

At the moment we *must* penalize pipeliner scheduling in this case in general, because it will sometimes reverse def-use pairs for physical registers. I suspect that Hexagon was happy with this for so long because there's something in its scheduling model which is just preventing this from happening. That is not true for other targets. (ARM is where I discovered it.)

In the meanwhile, a TODO explaining what needs to be done and that the flag would need to be removed makes sense. It also might be good to make the flag name a bit more explicit about what we're assuming here ... "AssumePhysicalDefUseDontReorderWhenModuloScheduled", is probably a bit long, but it highlights the assumption being made. I'd like to avoid someone down the line saying "look, there's a flag that might improve performance!" without being warned that it might also break correctness.

Hi @ssarda, I assume the case in question is for a .new use on a physical register? It sounds like the issue is that the use appears before the definition in the linear list of instructions after the pipeliner because orderDependences doesn't order them properly? If so, that's a bug. I'm surprised that doesn't cause other issues. For example, there is no guarantee that the use will appear in the same packet as the def. if they aren't ordered correctly.

Hi @bcahoon, i am yet to come across a case in Hexagon where in a linear list of instruction, a use occurs before def in orderDependence(). I assume, that's never a case for Hexagon happening yet. As pointed out by @dpenry, orderDependence() is not robust for other targets to make sure that def-use are in proper order, because of which we are trying to avoid same cycle scheduling by having check in isValidSchedule() function. But since, this is not the case with Hexagon target, this patch enables same cycle def-use scheduling for only Hexagon with a flag. Let us know your thoughts over it.

Hi @ssarda, can you share a test case that is fixed by your patch?

ssarda updated this revision to Diff 475090.Nov 14 2022, 2:42 AM

@bcahoon attached a test case

Hi @ssarda, thanks for the test case. That helps to understand the problem. Which appears to be the following MIR,

INLINEASM &"$0 = insert($1,#31,#1);" [attdialect], $0:[regdef], implicit-def $r7, $1:[reguse:IntRegs], %45:intregs
%32:intregs = COPY $r7
$r7 = COPY %32:intregs

These instructions are scheduled in the same cycle. Then, isValidSchedule rejects this because :

// Furthermore, if a physical def/use pair is assigned to the same
// cycle, orderDependence does not guarantee def/use ordering, so that
// case should be considered invalid.

That means that orderDependence could reorder those instructions after pipelining?

Hi @ssarda, thanks for the test case. That helps to understand the problem. Which appears to be the following MIR,

INLINEASM &"$0 = insert($1,#31,#1);" [attdialect], $0:[regdef], implicit-def $r7, $1:[reguse:IntRegs], %45:intregs
%32:intregs = COPY $r7
$r7 = COPY %32:intregs

These instructions are scheduled in the same cycle. Then, isValidSchedule rejects this because :

// Furthermore, if a physical def/use pair is assigned to the same
// cycle, orderDependence does not guarantee def/use ordering, so that
// case should be considered invalid.

You are right @bcahoon.

That means that orderDependence could reorder those instructions after pipelining?

For Hexagon, i haven't come across a case yet where the orderDependence is re-ordering instructions in such a way that use is scheduled before def.
Moreover, if there is a problem in orderDependence not guaranteeing def scheduled before use, then the fix should go in orderDependence() function.
I feel that by putting stricter checks in isValidSchedule() function for same cycle, we are penalizing pipeliner resulting in un-pipelined code.