This is an archive of the discontinued LLVM Phabricator instance.

[RISCV][MC] Refine MCInstrAnalysis based on registers used
ClosedPublic

Authored by jobnoorman on Mar 20 2023, 10:04 AM.

Details

Summary

MCInstrAnalysis provides a number of methods to query properties of
instructions (e.g., isTerminator(), isCall(),...). The default
implementation of these methods forwards the query to MCInstrDesc which
returns information based on various RISCVInstrInfo*.td files.

Since the info in MCInstrDesc is based on opcodes only, it is often
quite inaccurate. For example, JAL/JALR are never recognized as
terminators or branches while they certainly can be. However,
MCInstrAnalysis has access to the full MCInst so can improve accuracy by
inspecting registers used by the instruction.

This patch refines the following MCInstrAnalysis methods:

  • isTerminator: JAL/JALR with RD=X0;
  • isCall: JAL/JALR with RD!=X0
  • isReturn: JALR/C_JR with RD=X0, RS1 in {X1, X5}
  • isBranch: JAL/JALR/C_JR with RD=X0, RS1 not in {X1, X5};
  • isUnconditionalBranch: JAL/JALR/C_JR with RD=X0, RS1 not in {X1, X5};
  • isIndirectBranch: JALR/C_JR with RD=X0, RS1 not in {X1, X5};

Note that the reason for this patch is to simplify the RISCV target in
BOLT. While it's possible to implement everything there, it seems more
logical to implement it directly in the RISCV backend as other tools
might also be able to benefit from it.

Diff Detail

Event Timeline

jobnoorman created this revision.Mar 20 2023, 10:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2023, 10:04 AM
jobnoorman requested review of this revision.Mar 20 2023, 10:04 AM
pcwang-thead added a comment.EditedMar 20 2023, 8:11 PM

Thanks! I think it's helpful. And I just have some questions.

llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
168

What about call t0, OUTLINED_FUNCTION where register is RISCV::X5?

180

What about jalr x0, x5 which is used in outlined function?

jobnoorman added inline comments.Mar 21 2023, 3:21 AM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
168

It's an interesting question whether we should handle outlined function calls here. I'd say it makes sense to do so but I would like to hear some more opinions about this.

We could also consider returning true for all JAL/JALR instructions where RD!=X0. The reasoning being that if the return address is stored anywhere, it's probably a function call.

180

If we decide to do the same for isCall, this could make sense for consistency. However, I'm a bit afraid that many indirect branches that happen to use X5 will be incorrectly recognized as returns.

Since both isCall and isReturn can only be estimates on RISC-V given just a single MCInst, the question is whether we want to over- or underestimate. I personally lean towards underestimating since returns are, in general, easy to detect on most targets. So giving a lot of false positives here might trip-up tools using this API.

asb added inline comments.Mar 27 2023, 1:55 AM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
180

I'd lean towards being conservative in isReturn, and if a user needs to know if it might be a return, a new maybeReturn or similar method can be introduced (and implemented in terms of isReturn() || some_other_logic of course).

jobnoorman added inline comments.Mar 29 2023, 3:28 AM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
180

That makes sense! Any thoughts on implementing isCall as "JAL/JALR and RD!=X0"?

asb added inline comments.Mar 30 2023, 3:08 AM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
180

No strong opinion. I think as long as you properly document the limitations/assumptions being made, we're OK.

craig.topper added inline comments.Mar 30 2023, 10:04 PM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
180

Due to this section of the RISC-V spec, the compiler will never use R5 for an indirect jump.

Return-address prediction stacks are a common feature of high-performance instruction-fetch units, but require accurate detection of instructions used for procedure calls and returns to be effective. For RISC-V, hints as to the instructions’ usage are encoded implicitly via the register numbers used. A JAL instruction should push the return address onto a return-address stack (RAS) only when rd is x1 or x5. JALR instructions should push/pop a RAS as shown in the Table 2.1.

We use GPRJALR as the register class on PseudoBRIND and PseudoCALLIndirect to avoid using R1 or R5.

jobnoorman updated this revision to Diff 511321.Apr 6 2023, 1:29 AM
jobnoorman edited the summary of this revision. (Show Details)

Take outlined function calls/returns into account:

  • isCall: true when RD!=X0 (this matches more than just outlined calls but it seems to make sense to treat any JAL/JALR that stores its return address as a call).
  • isReturn: True when RS1 is X1 (normal returns) or X5 (returns from outlined functions).
jobnoorman added inline comments.Apr 6 2023, 1:31 AM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
180

Thanks, @craig.topper. I've updated the patch to take outlined functions into account.

craig.topper added inline comments.Apr 7 2023, 11:15 AM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
196

Does this need to check !maybeReturnAddress so that isBranch doesn't return true for the same case as isReturn?

jobnoorman added inline comments.Apr 11 2023, 1:08 AM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCTargetDesc.cpp
196

It does! Will add in next update.

jobnoorman edited the summary of this revision. (Show Details)
  • Address reviewer comment
  • Add isIndirectBranch
asb added inline comments.Apr 11 2023, 4:30 AM
llvm/unittests/Target/RISCV/MCInstrAnalysisTest.cpp
64

Nit: I think for most tests in this file you actually want EXPECT_THAT (for which a test failure indicates a non-fatal error), as the following tests still make sense to run even if there was an earlier failure.

Address reviewer comment: replace ASSERT_THAT with EXPECT_THAT in tests.

jobnoorman marked 2 inline comments as done.Apr 11 2023, 5:42 AM

Friendly ping.

Friendly ping.

This revision is now accepted and ready to land.May 15 2023, 4:43 PM
barannikov88 added inline comments.
llvm/unittests/Target/RISCV/MCInstrAnalysisTest.cpp
2

Should tests contain a license header?

MaskRay added inline comments.May 15 2023, 6:48 PM
llvm/unittests/Target/RISCV/MCInstrAnalysisTest.cpp
40
MaskRay added inline comments.May 15 2023, 6:51 PM
llvm/unittests/Target/RISCV/MCInstrAnalysisTest.cpp
66

I think the idiom is EXPECT_TRUE and EXPECT_FALSE for functions that return bool.

EXPECT_THAT ..* IsTrue() is for a type that converts to a bool.

jobnoorman edited the summary of this revision. (Show Details)
  • Add license header
  • Remove anonymous namespace around functions, make static instead
  • Replace ASSERT_THAT(..., IsTrue()/IsFalse) with ASSERT_TRUE/FALSE
jobnoorman marked 3 inline comments as done.May 16 2023, 2:49 AM
MaskRay accepted this revision.May 16 2023, 11:29 AM
This revision was landed with ongoing or failed builds.May 17 2023, 2:47 AM
This revision was automatically updated to reflect the committed changes.
evandro removed a subscriber: evandro.May 17 2023, 3:51 PM