This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Add missing operand update when copy is required
ClosedPublic

Authored by ehjogab on Nov 11 2020, 3:41 AM.

Details

Summary

When constraining an operand register using constrainOperandRegClass(),
the function may emit a COPY in case the provided register class does
not match the current operand register class. However, the operand
itself is not updated to make use of the COPY, thereby resulting in
incorrect code. This patch fixes that bug by updating the machine
operand accordingly.

Diff Detail

Event Timeline

ehjogab created this revision.Nov 11 2020, 3:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2020, 3:41 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
ehjogab requested review of this revision.Nov 11 2020, 3:41 AM

Is a test possible?

llvm/lib/CodeGen/GlobalISel/InstructionSelector.cpp
38

Does this need to call the change observer? Is a test possible?

ehjogab added inline comments.Nov 13 2020, 1:39 AM
llvm/lib/CodeGen/GlobalISel/InstructionSelector.cpp
38

I'm not familiar with the change observer, so I don't know whether it needs to be called. If you know, can you please inform me on how the change observer works and is intended to be used?

The test case requires a target where you have a register bank that can map to at least two register classes, and I haven't found any target with that. Due to how target is currently implemented, we use a single register bank as an intermediary step in porting our own target to global-isel, and that's when I stumbled upon this bug.

To convince oneself that this is a bug, look at how constrainOperandRegToRegClass() is called. There is only one place, in the implementation of GIR_ConstrainOperandRC in llvm/include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h. There, it does nothing to update the instruction in case a copy is emitted. Also, the documentation for constrainOperandRegToRegClass() says the following (see bold):

/// Constrain a register operand of an instruction \p I to a specified 
/// register class. This could involve inserting COPYs before (for uses) or
/// after (for defs) and **may replace the operand of \p I.**
/// \returns whether operand regclass constraining succeeded.
arsenm added inline comments.Jan 4 2021, 4:13 PM
llvm/lib/CodeGen/GlobalISel/InstructionSelector.cpp
38

Looking at how constrainOperandRegClass is implemented, I think the intent was it would handle all of the instruction modifications. It should be responsible for updating the operand, not the caller here (actually InstructionSelector::constrainOperandRegToRegClass seems like a fairly pointless wrapper function)

ehjogab added inline comments.Jan 4 2021, 11:01 PM
llvm/lib/CodeGen/GlobalISel/InstructionSelector.cpp
38

I find three places where constrainOperandRegClass() is called and its result is used to update the operand. So I disagree with the original intent being that constrainOperandRegClass() should also update the operand. There is also a constrainOperandRegClass() for SelectionISel and there the result is sometimes used to update the operand, so I believe that the behavior for GlobalISel was simply copied from how SelectionISel does it.

But I also agree that constrainOperandRegToRegClass() is quite useless, and since all cases where constrainOperandRegClass() is invoked the result is used to update the operand, maybe it is a better idea to let constrainOperandRegClass() do it for us. I will update the patch accordingly.

ehjogab updated this revision to Diff 314590.Jan 5 2021, 6:37 AM

Machine operand is now updated in constrainOperandRegClass()

ehjogab marked an inline comment as done.Jan 5 2021, 6:38 AM
arsenm added inline comments.Jan 5 2021, 1:16 PM
llvm/lib/CodeGen/GlobalISel/Utils.cpp
76–84

I think I misread this the first time through and thought this was replacing uses. However I'm still a bit confused.

This calls changingAllUsesOfReg after the potential change from constrainRegToClass (this will also happen even if constrainRegToClass ends up being a no-op when the register already was a compatible class). I'm further confused since the existing cases get away without calling the observer. I'm guessing this is because this is already called in contexts changing the instruction, or not bothering with the observer? If we don't need to call the observer here, I would prefer the first version but I'm unclear on whether it's really needed here at all

ehjogab added inline comments.Jan 6 2021, 11:50 PM
llvm/lib/CodeGen/GlobalISel/Utils.cpp
76–84

This functions may replace both uses as well as defs, and the bug this patch was originally intended to fix concerned lack of update for defs.

I agree that things are not clear with the observer is intended to be called and when it is strictly necessary to do so. It would be good to have someone with such knowledge to have a look at this patch.

@aditya_nandakumar Can you please take a look at this patch? We are unsure on whether the observer is used correctly here.

dsanders added inline comments.Jan 7 2021, 12:35 PM
llvm/include/llvm/CodeGen/GlobalISel/Utils.h
71–73

Just to mention it, there may be some consequences to adding 'or after' to this. I don't know if it's still the case or if we fixed them all but some passes used to be a bit picky on exactly where it was safe to insert instructions around the 'current' instruction and would fail to act on instructions or crash due to invalidated iterators or newly freed instructions.

llvm/lib/CodeGen/GlobalISel/InstructionSelector.cpp
38

I'm not familiar with the change observer, so I don't know whether it needs to be called. If you know, can you please inform me on how the change observer works and is intended to be used?

If you change the MIR (or are about to in some cases like erasingInstr()), you need to tell the observers. The observer is a way for something (typically the core loop of a pass) to keep track of what's going on so it can maintain it's data structures (e.g. avoid visiting deleted instructions), or report/analyze it. Other observers analyze lost DebugLoc's or simply print debugging output

Some observers can be fairly tolerant to failing to call them (especially the changingInstr/changedInstr functions as they were added later) but it's still a bug not to.

To convince oneself that this is a bug, look at how constrainOperandRegToRegClass() is called. There is only one
place, in the implementation of GIR_ConstrainOperandRC in
llvm/include/llvm/CodeGen/GlobalISel/InstructionSelectorImpl.h. There, it does nothing to update the instruction in
case a copy is emitted.

FWIW, it doesn't necessarily need to update the instruction immediately here but it's not doing the other method either. The other method is to have a subsequent GIR_* opcode that updates the uses but that would require us to record the result somewhere and we don't. I think the right method is to update as part of the GIR_ConstrainOperandRC as the other one will bloat the state machine a lot to handle a fairly rare case.

actually InstructionSelector::constrainOperandRegToRegClass seems like a fairly pointless wrapper function

Yep, it looks like it's no longer needed. A couple years back, tablegen emitted C++ directly and it was providing a lot of value for compile-time of the compiler itself. Eventually the compile-time of the generated code got excessive and we implemented the state machine

I find three places where constrainOperandRegClass() is called and its result is used to update the operand. So I disagree with the original intent being that constrainOperandRegClass() should also update the operand.

It's a bit of both really. The intent is for the caller to update the instruction it was working on and for the callee to update any other instructions that needed to be changed because of the constraint. The reason for the division is that the caller might be making other changes too that it either already has or is about to report. It's a bug to call changedInstr() and then make further changes without a new changingInstr() and it's undesirable to over-report changes by calling changingInstr()/changedInstr() multiple times for one change.

llvm/lib/CodeGen/GlobalISel/Utils.cpp
76–84

I've commented on this above (which I think is the same thread and phabricator has just failed to track it properly)

ehjogab added inline comments.Jan 7 2021, 11:23 PM
llvm/lib/CodeGen/GlobalISel/InstructionSelector.cpp
38

Thanks for your detailed reply, Daniel!

If you change the MIR (or are about to in some cases like erasingInstr()), you need to tell the observers. The observer is a way for something (typically the core loop of a pass) to keep track of what's going on so it can maintain it's data structures (e.g. avoid visiting deleted instructions), or report/analyze it. Other observers analyze lost DebugLoc's or simply print debugging output

Some observers can be fairly tolerant to failing to call them (especially the changingInstr/changedInstr functions as they were added later) but it's still a bug not to.

Okay, then I think I got the hang of it.

FWIW, it doesn't necessarily need to update the instruction immediately here but it's not doing the other method either. The other method is to have a subsequent GIR_* opcode that updates the uses but that would require us to record the result somewhere and we don't. I think the right method is to update as part of the GIR_ConstrainOperandRC as the other one will bloat the state machine a lot to handle a fairly rare case.

I think having a subsequent GIR_* opcode to do the updates is overkill, and would indeed needlessly bloat the state machine.

It's a bit of both really. The intent is for the caller to update the instruction it was working on and for the callee to update any other instructions that needed to be changed because of the constraint. The reason for the division is that the caller might be making other changes too that it either already has or is about to report. It's a bug to call changedInstr() and then make further changes without a new changingInstr() and it's undesirable to over-report changes by calling changingInstr()/changedInstr() multiple times for one change.

In light of this comment, then the correct way of handling this would be for constrainOperandRegClass() to insert the COPYs but not update the instruction itself (i.e. as it was before). I have, however, looked at the places where constrainOperandRegClass() is called and in none do they themselves invoke the observer, so it would be "okay" to move the operand update into constrainOperandRegClass() (i.e. as it is now in the patch). Also, it is more convenient to let constrainOperandRegClass() do the operand update since it must be done anyhow.

llvm/lib/CodeGen/GlobalISel/Utils.cpp
76–84

This calls changingAllUsesOfReg after the potential change from constrainRegToClass (this will also happen even if constrainRegToClass ends up being a no-op when the register already was a compatible class). I'm further confused since the existing cases get away without calling the observer. I'm guessing this is because this is already called in contexts changing the instruction, or not bothering with the observer? If we don't need to call the observer here, I would prefer the first version but I'm unclear on whether it's really needed here at all

I took a closer look and I believe that as it is now is correct:

  1. If the class is compatible, then we have only updated the register class. If the register is a use, then it means that we have also incurred a change that affects the instruction defining that register, and hence need to invoke changedInstr(); this we do. Lastly, we trigger changingAllUsesOfReg() to indicate a change that affects all uses of a register.
  2. If the class is not compatible, then the register is left unaffected and instead we introduce a new register along with a COPY. Independent of whether the register is a use or def, we affect no other instruction than the instruction that the operand belongs to. Hence we only need to invoke changedInstr().

The only potential bug here is that we invoke both changedInstr() and changingAllUsesOfReg() when the register is already in the correct class, and hence there is no update. I don't know whether that's a common case, though...

Any feedback on my latest comments?
Should the patch be accepted as is, or does it require more changes?

dsanders accepted this revision.Jan 19 2021, 2:31 PM

LGTM with the changingInstr() to match the changedInstr() on the setReg() path.

We ought to address the other issues too but they're pre-existing bugs so I don't want to block your patch on them. We can address them later

llvm/lib/CodeGen/GlobalISel/InstructionSelector.cpp
38

I agree that the operand update can be inside constrainOperandRegClass() but the changingInstr()/changedInstr() calls for that instruction need to either be outside or the internal calls should be optional if for when you're already in a changing/changed region.

I have, however, looked at the places where constrainOperandRegClass() is called and in none do they themselves invoke the observer

I'm surprised none of them call it but changed/changing was added much later than the others so I'm not surprised there's some omissions. At some point I need to run more passes with the lost DebugLoc observer and that will probably detect a fair few of them.

I think having a subsequent GIR_* opcode to do the updates is overkill, and would indeed needlessly bloat the state machine.

I agree

llvm/lib/CodeGen/GlobalISel/Utils.cpp
76–84

The only potential bug here is that we invoke both changedInstr() and changingAllUsesOfReg() when the register is already in the correct class, and hence there is no update. I don't know whether that's a common case, though...

We should probably fix that but I don't think we need to do it in this patch as it's not exactly wrong, it's just overreporting.

Along the same lines, this function generally seems to get changingInstr() wrong. For the new changedInstr() on the setReg() path it's missing and for the existing changingAllUsesOfReg() it's happening after the change it's reporting. It's supposed to happen before the change made by constrainRegToClass() if it opts to constrain using the same register. The setReg() path is easy to fix as we can just call changingInstr() just before the setReg() so I think we should go ahead and do that, but the changingAllUsesOfReg() is going to be much harder (although it shouldn't block your patch as it's pre-existing)

This revision is now accepted and ready to land.Jan 19 2021, 2:31 PM
ehjogab updated this revision to Diff 317785.Jan 19 2021, 11:47 PM

Added changingInstr() before setReg()

ehjogab marked an inline comment as done.Jan 19 2021, 11:47 PM

I'll submit this for @ehjogab in a little bit.

This revision was landed with ongoing or failed builds.Jan 20 2021, 1:34 AM
This revision was automatically updated to reflect the committed changes.