This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Place jump table as the first operand in additions
ClosedPublic

Authored by chill on Nov 7 2017, 11:14 AM.

Details

Summary

When generating table jump code for switch statements, place the jump
table label as the first operand in the various addition instructions in
order to enable addressing mode selectors to better match index
computation and possibly fold them into the addressing mode of the table
entry load instruction.

Diff Detail

Repository
rL LLVM

Event Timeline

chill created this revision.Nov 7 2017, 11:14 AM

I understand what you're trying to do with the changes to ARMISelLowering.cpp make sense, but I'm surprised our address-matching code doesn't try to commute the operands. Do we really not have code to do that somwhere?

I'm not sure what the change to ARMAsmPrinter.cpp is supposed to do; is it buggy, and nobody noticed because it never triggers?

I guess the change to ARMConstantIslandPass.cpp is just to match the new pattern we generate in Thumb1 mode?

Would this code be less confusing if we address the FIXME in ARMInstrInfo.td about using "addrmode2" in BR_JTm?

chill added a comment.EditedNov 8 2017, 12:39 AM

I understand what you're trying to do with the changes to ARMISelLowering.cpp make sense, but I'm surprised our address-matching code doesn't try to commute the operands. Do we really not have code to do that somwhere?

I prefer to have any kind of IR generated or transformed into some (more or less) canonicalised form, to reduce the number of cases that need to be matched. For such indexed loads
it's fairly typical and natural to shift/multiply the index, instead of the base address.

I'm not sure what the change to ARMAsmPrinter.cpp is supposed to do; is it buggy, and nobody noticed because it never triggers?

Yes, the LDRrs should have the destination reg, the base reg, the index reg, and the shift amount. I'll see about adding a testcase for this.

I guess the change to ARMConstantIslandPass.cpp is just to match the new pattern we generate in Thumb1 mode?

It matches the different order of operands generated in ARMISelLOwering.cpp, ARMTargetLowering::LowerBR_JT.

Would this code be less confusing if we address the FIXME in ARMInstrInfo.td about using "addrmode2" in BR_JTm?

I have a patch for that too, will sent it shortly. The two changes are don't conflict or make one another redundant, IIRC.

chill added a comment.Nov 8 2017, 11:03 AM

I understand what you're trying to do with the changes to ARMISelLowering.cpp make sense, but I'm surprised our address-matching code doesn't try to commute the operands. Do we really not have code to do that somwhere?

I prefer to have any kind of IR generated or transformed into some (more or less) canonicalised form, to reduce the number of cases that need to be matched. For such indexed loads
it's fairly typical and natural to shift/multiply the index, instead of the base address.

I'm not sure what the change to ARMAsmPrinter.cpp is supposed to do; is it buggy, and nobody noticed because it never triggers?

Yes, the LDRrs should have the destination reg, the base reg, the index reg, and the shift amount. I'll see about adding a testcase for this.

Now that I looked closer, yes, this case didn't trigger, because we haven't had a shift amount there until this patch. Without this change
the compiler emits, e.g. ldr pc, [r1, r0] instead of ldr pc, [r1, r0, lsl #2]. The test test/CodeGen/ARM/arm-position-independence-jump-table.ll
tests this code, so no additional test needed.

I prefer to have any kind of IR generated or transformed into some (more or less) canonicalised form, to reduce the number of cases that need to be matched. For such indexed loads it's fairly typical and natural to shift/multiply the index, instead of the base address.

Sure... but we definitely have code in ARMDAGToDAGISel::SelectAddrMode2Worker which tries to commute the base and offset. I'd like to understand why it isn't triggering. It's not a big deal to explicitly commute the operands here, but the same issue could affect other cases where the addition comes out of user code.

Just took a quick look with a debugger; it looks like the code never triggers because of the hasOneUse() check?

chill added a comment.Nov 9 2017, 12:40 AM

I prefer to have any kind of IR generated or transformed into some (more or less) canonicalised form, to reduce the number of cases that need to be matched. For such indexed loads it's fairly typical and natural to shift/multiply the index, instead of the base address.

Sure... but we definitely have code in ARMDAGToDAGISel::SelectAddrMode2Worker which tries to commute the base and offset. I'd like to understand why it isn't triggering. It's not a big deal to explicitly commute the operands here, but the same issue could affect other cases where the addition comes out of user code.

Just took a quick look with a debugger; it looks like the code never triggers because of the hasOneUse() check?

Yes, that's right. That check was added by the following commit

https://github.com/llvm-mirror/llvm/commit/f40deed62f4f0126459ed7bfd1799f4e09b1aaa7#diff-c3cb15ba0731744ba5fbdc0ba6551c2eL498

I can't quite understand the logic there - certainly it would make more sense to fold the shift into the addressing
mode when shift node *has* one use ?

efriedma accepted this revision.Nov 9 2017, 12:27 PM

LGTM.

I can't quite understand the logic there - certainly it would make more sense to fold the shift into the addressing mode when shift node *has* one use ?

I think that was the intent... but it got mixed up, and nobody caught it because it's not something which would trigger often anyway (since GEPs are lowered with the pointer base on the LHS). It is possible to show the general missed optimization with C code like the following: "int a(int x, int y) { return *((char*)(x*4)+y); }". But I'm fine leaving that for a separate patch.

This revision is now accepted and ready to land.Nov 9 2017, 12:27 PM
This revision was automatically updated to reflect the committed changes.