This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Relax ta/ma policy into tu/mu in InsertVSETVLI
Changes PlannedPublic

Authored by luke on Jul 26 2023, 5:12 AM.

Details

Summary

Currently, we only allow a vsetvli to be merged with another vsetvli if their
tail and mask policies are exactly the same.
However, an agnostic policy can be replaced with an undisturbed policy, since
it satisfies the requirement that the elements are either left unchanged (or
replaced with all ones).
This patch relaxes the requirement in areCompatibleVTYPEs to allow converting
ta -> tu and ma -> mu. This affects both the top-to-bottom and bottom-to-top
passes.

Diff Detail

Event Timeline

luke created this revision.Jul 26 2023, 5:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 26 2023, 5:12 AM
luke requested review of this revision.Jul 26 2023, 5:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 26 2023, 5:12 AM
luke added a comment.Jul 26 2023, 5:15 AM

This reduces the number of vsetvli toggles and improves code size at least, but I can't comment on the effect of undisturbed on register renaming etc.

This reduces the number of vsetvli toggles and improves code size at least, but I can't comment on the effect of undisturbed on register renaming etc.

This can create false dependencies on out of order CPUs that will prevent instructions from being executed until the previous writer of their destination register has completed.

I like the idea, but as @craig.topper says, this can be negative for OoO processors.
What about making this feature a SubtargetFeature and then processors can enable it depending on their implementation?
(and there won't be huge changes in tests)

luke planned changes to this revision.Jul 26 2023, 12:24 PM

This reduces the number of vsetvli toggles and improves code size at least, but I can't comment on the effect of undisturbed on register renaming etc.

This can create false dependencies on out of order CPUs that will prevent instructions from being executed until the previous writer of their destination register has completed.

I like the idea, but as @craig.topper says, this can be negative for OoO processors.
What about making this feature a SubtargetFeature and then processors can enable it depending on their implementation?
(and there won't be huge changes in tests)

I thought there might be a caveat as such, thanks for the explanation. I don't have any specific subtarget in mind for this, so I'm happy to remove it off of the review queue for now. We can always pick it up later if there's an in order processor that could utilise it

wangpc added subscribers: JojoR, zixuan-wu.EditedJul 26 2023, 7:45 PM

This reduces the number of vsetvli toggles and improves code size at least, but I can't comment on the effect of undisturbed on register renaming etc.

This can create false dependencies on out of order CPUs that will prevent instructions from being executed until the previous writer of their destination register has completed.

I like the idea, but as @craig.topper says, this can be negative for OoO processors.
What about making this feature a SubtargetFeature and then processors can enable it depending on their implementation?
(and there won't be huge changes in tests)

I thought there might be a caveat as such, thanks for the explanation. I don't have any specific subtarget in mind for this, so I'm happy to remove it off of the review queue for now. We can always pick it up later if there's an in order processor that could utilise it

As far as I know, THead XuanTie C908 core ([1, 2, 3]) is in-order and there is no register renaming I think.
CC @zixuan-wu @JojoR to see if they are interested in this.

  1. Product Introduction of C908, Chinese
  2. XuanTie C908: High-performance RISC-V Processor Catered to AIoT Industry
  3. T-Head XuanTie C908 RISC-V core targets AIoT applications