This is an archive of the discontinued LLVM Phabricator instance.

[Kryo] Use immediate #0 to zero a register
ClosedPublic

Authored by haicheng on May 5 2016, 10:57 AM.

Details

Summary

Similar to Cyclone, Kryo favors using immediate #0 to zero out registers.

Diff Detail

Repository
rL LLVM

Event Timeline

haicheng updated this revision to Diff 56309.May 5 2016, 10:57 AM
haicheng retitled this revision from to [Kryo] Use immediate #0 to zero a register.
haicheng updated this object.
haicheng added reviewers: mcrosier, gberry, mssimpso.
haicheng set the repository for this revision to rL LLVM.
haicheng updated this object.May 5 2016, 10:58 AM
haicheng updated this object.May 5 2016, 11:06 AM
mcrosier edited edge metadata.May 5 2016, 11:13 AM

I'd replace "HasImmeZeroing" with "UseImmZeroing".

lib/Target/AArch64/AArch64.td
52 ↗(On Diff #56309)

s/imme/imm

54 ↗(On Diff #56309)

s/great/preferred

56 ↗(On Diff #56309)

s/imme/imm

57 ↗(On Diff #56309)

s/immez/immz

151 ↗(On Diff #56309)

s/Imme/Imm

lib/Target/AArch64/AArch64Subtarget.h
139 ↗(On Diff #56309)

s/has/use

haicheng updated this revision to Diff 56316.May 5 2016, 11:13 AM
haicheng edited edge metadata.
haicheng added subscribers: llvm-commits, t.p.northover.

Change one line of the comment.

haicheng updated this revision to Diff 56320.May 5 2016, 11:29 AM
haicheng marked 6 inline comments as done.

Addresses Chad's comments.

@t.p.northover: Much like Cyclone, Kryo prefers the move immediate zero idiom for zeroing registers. Unfortunately, when we tried to reuse the ZeroCycleZeroing target feature we found regressions in some of our FP workloads. The issue is described in PR27454. Do you see a way we could have a unified approach for Cyclone and Kryo that avoids the issue in PR27454?

lib/Target/AArch64/AArch64InstrInfo.cpp
1932 ↗(On Diff #56320)

If I'm not mistaken, Matt suggested this be Subtarget.isKryo(), rather than adding a new target feature.

haicheng updated this revision to Diff 57866.May 19 2016, 3:22 PM
haicheng updated this object.

Change to use subtarget hook. Enable float point zeroing as well.

haicheng updated this revision to Diff 62990.Jul 6 2016, 4:15 PM
haicheng updated this object.
haicheng added a subscriber: MatzeB.

Now that r274686 is committed, I can simply enable zczeroing feature in Kryo for both integer and float registers. I have to explicitly set the isAsCheapAsAMove feature because Kryo has customized CheapAsMoveHandling.

MatzeB added a comment.Jul 6 2016, 4:42 PM

Looks good to me, but I believe the test can be simplified:

test/CodeGen/AArch64/kryo_zeroing.ll
26–45 ↗(On Diff #62990)

These tests appear to have a lot more instructions than necessary just to produce mov x9, #0; mov x8, #0. Also wouldn't it be better to merge this with arm64-zero-cycle-zeroing.ll, maybe you can just addnother run line there?

haicheng updated this revision to Diff 63008.Jul 6 2016, 6:19 PM
haicheng marked an inline comment as done.

As suggested by Matthias, add another run line in arm64-zero-cycle-zeroing.ll

MatzeB added inline comments.Jul 6 2016, 7:26 PM
lib/Target/AArch64/AArch64InstrInfo.cpp
629–632 ↗(On Diff #63008)

Shouldn't this rather be "case AArch64::COPY: return true"?

The whole purpose of this function is that some passes can decide whether to copy a value around with COPY or just recompute the value instead.

haicheng added inline comments.Jul 6 2016, 7:44 PM
lib/Target/AArch64/AArch64InstrInfo.cpp
629–632 ↗(On Diff #63008)

I agree with you. I don't know why COPY is not included in the first place. However, can I do it in a separate patch because it affects other subtargets? This patch is supposed to affect Kryo only.

haicheng updated this revision to Diff 63597.Jul 11 2016, 3:25 PM

I separate this patch into two parts, ZCZeroing part and isAsCheapAsAMove part. This patch now only has the first part.

I am still collecting performance data of setting any COPY isAsCheapAsAMove on A57 and Kryo which may takes a while. Moreover, I am thinking maybe I should use isZeroCost() API to set all zero cost pseudo-instructions (including COPY) isAsCheapAsAMove. I will upload another patch after the experiment is done.

MatzeB accepted this revision.Jul 11 2016, 3:32 PM
MatzeB added a reviewer: MatzeB.

LGTM.

This revision is now accepted and ready to land.Jul 11 2016, 3:32 PM

Thank you, Matthias.

Another thing about Cyclone's ZCZeroing feature is AArch64InstrInfo::isGPRZero() which I believe is only used by Cyclone. This function returns true for COPY WZR but not COPY XZR. I don't know if it is intentional or a mistake.

This revision was automatically updated to reflect the committed changes.