This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Macro fusion of simple ALU ops with branches for Broadcom's Vulcan
ClosedPublic

Authored by pgode on Jul 6 2016, 6:31 AM.

Details

Summary

Support for the macro fusion of simple ALU ops with branches for the Vulcan sub-target.

Patch by Meador Inge

Diff Detail

Event Timeline

pgode updated this revision to Diff 62853.Jul 6 2016, 6:31 AM
pgode retitled this revision from to [AArch64] Macro fusion of simple ALU ops with branches for Broadcom's Vulcan.
pgode updated this object.
pgode added reviewers: t.p.northover, rengolin.
pgode added subscribers: llvm-commits, meadori, echristo and 2 others.
rengolin added inline comments.Jul 6 2016, 8:18 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
1806

This looks like a job for table-gen?

rengolin added inline comments.Jul 6 2016, 8:19 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
1806

Also, please, don't add more getProcFamily calls, use sub-target features.

Redo the patch please and add llvm-commits as a subscriber before you
publish the patch.

pgode added inline comments.Jul 7 2016, 3:36 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
1806

For Cyclone, only subset of 6 instruction fusion cases are applicable, but for Vulcan, additional instructions are applicable, which may not be applicable for Cyclone.
So, to generalize this under one 'Subtarget Feature', getProcFamily call seems unavoidable. Please suggest.

Or, Is a new subtarget feature, such as 'FeatureMacroOpFusionVulcanSubset' a better option?

Also, I am not sure about the table-gen option and how it can be done there.

rengolin added inline comments.Jul 7 2016, 3:50 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
1806

This is not a Cyclone vs Vulcan, but some things are more profitable than others, and it's possible, as you said, that this is the case for other cores, including Cyclone.

A new target feature is needed, but one with "vulcan" in its name, because that would defeat the purpose. We are getting rid of things like that, ex. "isLikeA9" because it's a trap.

The table-gen option would be to add a property to those instructions and use a check on that property here. I'll pass on the benefits of that apporach vs. this (other people can chime in), but this is orthogonal to the target feature discussion.

Bottom line is: we don't want *more* CPU checks.

pgode added inline comments.Jul 8 2016, 12:27 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
1806

The approach of adding a new sub-feature for Macro-op fusion, by categorizing the instructions (my presumption) doesn't seem a good option. It will end up adding too many subfeature such as FeatureMacroOpFusionArith/FeatureMacroOpFusionLogical. Please correct me.

I approached the table-gen option of adding instruction property, similar to adding CheapAsAMov property. But, in MCID Flags, there are already 32 flags, 'new flag MacroOpFusable' becomes the 33rd flag. As there might be future flags as well, which someone might add, so we should use it as a 64-bit.

I am thinking of submitting a 'new diff' on this review by just enabling 'FeatureMacroOpFusion' (AArch64.td file modification) for Vulcan and let only ADDS, SUBS, ANDS get fused (default Subtarget feature behavior) and work on table-gen part for complete solution. Please suggest.

pgode added a comment.Jul 8 2016, 2:14 AM

The approach of adding a new sub-feature for Macro-op fusion, by categorizing the instructions (my presumption) doesn't seem a good option. It will end up adding too many subfeature such as FeatureMacroOpFusionArith/FeatureMacroOpFusionLogical. Please correct me.

I approached the table-gen option of adding instruction property, similar to adding CheapAsAMov property. In MCID(MCInstrDesc) Flags, there are already 32 flags, 'new flag MacroOpFusable' becomes the 33rd flag. Though Flags is 'uint64_t', still I see a warning message 'left shift count >= width of type'.

I am thinking of submitting a 'new diff' on this review by just enabling 'FeatureMacroOpFusion' (AArch64.td file modification) for Vulcan and let only ADDS, SUBS, ANDS get fused (default Subtarget feature behavior) and work on table-gen part for complete solution. Please suggest.

rengolin edited edge metadata.Jul 8 2016, 2:35 AM

I approached the table-gen option of adding instruction property, similar to adding CheapAsAMov property. In MCID(MCInstrDesc) Flags, there are already 32 flags, 'new flag MacroOpFusable' becomes the 33rd flag. Though Flags is 'uint64_t', still I see a warning message 'left shift count >= width of type'.

Hum, that's not good. We'll have to think about many of them, if we can turn them into properties, rather than features. There were some that could, maybe we need a larger re-factor than I was expecting.

I am thinking of submitting a 'new diff' on this review by just enabling 'FeatureMacroOpFusion' (AArch64.td file modification) for Vulcan and let only ADDS, SUBS, ANDS get fused (default Subtarget feature behavior) and work on table-gen part for complete solution. Please suggest.

I think you're right. This is the pragmatic approach and will give us time to work out a better way forward.

Thanks!
--renato

pgode updated this revision to Diff 63193.Jul 8 2016, 3:34 AM
pgode edited edge metadata.

Updated diff.
Removed fusion of additional instructions for Vulcan.
Default instructions supported by 'Macrofusion subtarget feature' will be fused.

rengolin accepted this revision.Jul 8 2016, 3:46 AM
rengolin edited edge metadata.

Right, that looks good, thanks!

test/CodeGen/AArch64/misched-fusion.ll is already testing the flag itself, so we don't need additional tests.

cheers,
--renato

This revision is now accepted and ready to land.Jul 8 2016, 3:46 AM
This revision was automatically updated to reflect the committed changes.