This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Optimize compare and branch sequence when the compare's constant operand is power of 2
ClosedPublic

Authored by bmakam on Mar 7 2016, 3:13 PM.

Details

Summary

Peephole optimization that generates a single TBZ/TBNZ instruction
for test and branch sequences like in the example below. This handles
the cases that miss folding of AND into TBZ/TBNZ during ISelLowering of BR_CC

Examples:

and  w8, w8, #0x400
cbnz w8, L1

to

tbnz w8, #10, L1

Diff Detail

Event Timeline

bmakam updated this revision to Diff 50001.Mar 7 2016, 3:13 PM
bmakam retitled this revision from to [Aarch64] Optimize test and branch sequence when the test's constant operand is power of 2.
bmakam updated this object.
bmakam added a subscriber: llvm-commits.
t.p.northover added inline comments.Mar 7 2016, 4:29 PM
include/llvm/Target/TargetInstrInfo.h
1109 ↗(On Diff #50001)

I think you're misinterpreting optimizeCondBranch it's for any conditional control flow at the end of a basic block, and in fact the AArch64 version already handles some TBZ cases.

So I think the logic of this function should be folded into the other (as neatly as possible).

bmakam updated this revision to Diff 50038.Mar 8 2016, 5:29 AM

Address Tim's comment.

bmakam retitled this revision from [Aarch64] Optimize test and branch sequence when the test's constant operand is power of 2 to [AArch64] Optimize test and branch sequence when the test's constant operand is power of 2.Mar 8 2016, 5:30 AM

Thanks for the review Tim,

I have folded the logic into optimizeCondBranch as you suggested. Does this look reasonable?

mcrosier edited edge metadata.EditedMar 8 2016, 6:10 AM

Please run clang-format on your patch, Balaram.

lib/Target/AArch64/AArch64InstrInfo.cpp
2956

s/test/compare/

I. e., Replace compare and branch sequence by..?

3015

Please add a period. It might also be helpful to comment your converting this sequence to tbz/tbnz when the AND is a constant power of 2.

3029

Might be easier to read as: Opc != ANDW && Opc != ANDX. This should also remove the unneeded parens.

3052

Should you be checking the Imm value here? Isn't it possible to have an Imm <= 31, but still be defining a 64bit register?

I'm wondering if the condition should be something more like (AndDefMI->getOpcode() == AArch64::ANDXri), as is done above?

3054

Single statement; No need for curly brackets.

You could also use a select statement.

3061

Same.

t.p.northover added inline comments.Mar 8 2016, 9:57 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
3016

This looks overly specialized to me: "look through precisely one COPY, but don't accept a direct AND".

Shouldn't we make the look-through-copy logic generic (so it applies to the CSINC/CBZ too), and iterative?

3073

I'd be inclined to make this whole thing a switch on DefMI->getOpcode() once it's been found. They're handled more like sibling cases anyway, and I could easily see us adding other patterns later.

bmakam marked 5 inline comments as done.Mar 9 2016, 6:28 AM

Thanks for the comments Chad and Tim,
I will post an updated patch including all your suggestions.

lib/Target/AArch64/AArch64InstrInfo.cpp
3016

Thanks Tim, this looks reasonable to me. I will post an updated patch including this suggestion.

3052

The immediate is the bit that is tested whether it is set or not, so if it is less than 32 and x[0-9]+, #c is same as w[0-9]+, #c. Furthermore the integrated linker treats and x[0-9]+, #c differently and transforms it into and x[0-9]+, 32+#c if c <32. I think this is a bug in the integrated linker, if we use no-integrated-linker and x[0-9]+, #c is transformed into and w[0-9]+, #c automatically. I therefore am checking the immeadiate value to avoid dependency on the underlying linker.

3073

Sounds reasonable to me again, will do in updated patch. Just out of curiosity, what other patterns do you think can be added?

t.p.northover added inline comments.Mar 9 2016, 7:02 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
3073

The one I thought of the other day was MOV/CBZ -> B. "ORR #imm" too. I've no idea if these occur enough to be worthwhile though, so no need to add them unless you actually want to.

bmakam updated this revision to Diff 50290.Mar 10 2016, 9:31 AM
bmakam retitled this revision from [AArch64] Optimize test and branch sequence when the test's constant operand is power of 2 to [AArch64] Optimize compare and branch sequence when the compare's constant operand is power of 2.
bmakam edited edge metadata.

Address review comments.

t.p.northover accepted this revision.Mar 10 2016, 9:34 AM
t.p.northover edited edge metadata.

Thanks for updating the patch. This looks good to go now.

Tim.

This revision is now accepted and ready to land.Mar 10 2016, 9:34 AM
This revision was automatically updated to reflect the committed changes.
chapuni added inline comments.
llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
2980 ↗(On Diff #50295)

2nd "\brief" should not be here. [-Wdocumentation]

mcrosier added inline comments.Mar 21 2016, 6:49 AM
llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp
2980 ↗(On Diff #50295)

Committed r263942.