Fixes PR35023.
Details
Diff Detail
Event Timeline
| lib/CodeGen/ExpandISelPseudos.cpp | ||
|---|---|---|
| 75 ↗ | (On Diff #179002) | I think the logic needed here is a little more complicated. Fortunately, someone else has already been scarred for life by working it out (I expect) and you can just call addLiveIns from LivePhysRegs.h | 
| lib/CodeGen/ExpandISelPseudos.cpp | ||
|---|---|---|
| 75 ↗ | (On Diff #179002) | Actually, shouldn't whatever's doing the splitting update liveness too? For all ExpandISelPseudos knows there could be a dozen new blocks scattered across the function that need updating. | 
| lib/CodeGen/ExpandISelPseudos.cpp | ||
|---|---|---|
| 75 ↗ | (On Diff #179002) | +1 | 
| lib/CodeGen/ExpandISelPseudos.cpp | ||
|---|---|---|
| 75 ↗ | (On Diff #179002) | 
 We could push the responsibility to the lowering code, but it would mean that we have to go through all lowering code. Doing it in ExpandISelLowering potentially misses cases as well though. I'll take a look at the lowering code, a common pattern seems to be calling MBB->splice & MBB->transferSuccessorsAndPHIs in the lowering code (I checked ARM and X86). In some cases, there is manual handling for CPSR already, in some cases (like COPY_STRUCT_BYVAL_I32) it is missing. | 
Use LivePhysRegs for liveins computation, pushed code to ARMISelLowering, handled all cases I could spot there. Adding Eli, as he fixed a similar problem in D54192, which I think should also be handled by the general liveins computation.
I'm a little concerned computeAndAddLiveIns is more expensive than checkAndUpdateCPSRKill. In theory, they're both quadratic, but in practice I would expect checkAndUpdateCPSRKill is much cheaper: it only tracks one register, and quits immediately when it finds a def of CPSR.
Right, but don't we have to handle all relevant physical registers, e.g. if we would have a use of a physical register after tMOVCCr_pseudo in test_slt_not ? It might not be very likely in practice, but it seems like there is nothing preventing that from happening (it is quite easy to construct a MIR test that fails the machine verifier after expand-isel-pseudos). But maybe I am missing some constraint here?
The way we iterate over the instructions in ExpandISelPesudos also means that we end up doing more work than necessary if we have multiple pseudos introducing new MBBs in the original MBB, but we might be able to visit instructions in reverse order or something like that.
The interesting constraint here is that ExpandISelPesudos runs immediately after ISel. And if you have an instruction which produces physical registers from an allocatable register class, ISel will COPY the values to virtual registers immediately after that instruction (and similarly for instructions that consume physical registers). So in practice, on ARM, the only physical register with a non-trivial live interval in ExpandISelPesudos is CPSR.
Granted, that's a bit fragile, and it isn't formally enforced anywhere.
Thanks Eli, I guess we can just go with only handling CPSR. I will update the patch!
| lib/Target/ARM/ARMISelLowering.cpp | ||
|---|---|---|
| 9009 ↗ | (On Diff #179203) | Hang on... doesn't the loop clobber CPSR? I think rather than add the live-in marking like this, we need to mark COPY_STRUCT_BYVAL_I32 with "Defs = [CPSR]". | 
Could we just say that ExpandISelPseudo doesn't preserve the liveness and have it be recomputed when needed?
Basically, is the complexity we add here worth the compile time we save?
In particular, I am worried that we will do a lot of recomputations if there are a lot of copy_struct_by_val operations around, whereas we would do only one by recomputing the liveness lazily.
| lib/Target/ARM/ARMISelLowering.cpp | ||
|---|---|---|
| 9009 ↗ | (On Diff #179203) | Yes, at least when not in Thumb1, we define CPSR in the loop. I guess something could still potentially use CPSR after the COPY_STRUCT_BYVAL, then we would still need it in the live-ins. Not sure if that would make sense though. | 
| lib/Target/ARM/ARMISelLowering.cpp | ||
|---|---|---|
| 9009 ↗ | (On Diff #179203) | Presumably nothing is ever going to use the CPSR def of a COPY_STRUCT_BYVAL... we would have to add explicit logic for that, and the loop sets CPSR to a fixed value which isn't generally useful. | 
| lib/Target/ARM/ARMISelLowering.cpp | ||
|---|---|---|
| 9009 ↗ | (On Diff #179203) | Right thanks, so for now I guess we can just mark it with Defs = [CPSR]. |