This is an archive of the discontinued LLVM Phabricator instance.

[XRay] Add support for RISCV
Needs ReviewPublic

Authored by ashwin98 on Jan 21 2022, 2:39 PM.

Details

Summary

Adds XRay support for RISCV. Currently, only RISCV64 has been tested. Changes required to add support for RISCV32 are present, but are commented out.

The modifications (as in the case of RISCVAsmPrinter) were made with the intention of making as few changes/touching as few files as possible to reduce the possibility of breaking any other functionality.

The double precision floating point extension was assumed to be part of the ISA while designing the trampolines (and this has been enforced in the Subtarget file). This implementation currently does not support compressed instructions.

Diff Detail

Event Timeline

ashwin98 created this revision.Jan 21 2022, 2:39 PM
ashwin98 requested review of this revision.Jan 21 2022, 2:39 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptJan 21 2022, 2:39 PM
Herald added subscribers: llvm-commits, Restricted Project, cfe-commits and 2 others. · View Herald Transcript
  1. Please upload patches with full context
  2. You should not need to have separate xray_riscv32/64.cpp, a single shared file with a small amount of XLEN-based conditions should suffice and reduce a whole load of code duplication. Possibly also applies to the trampoline assembly but maybe not, there are lots of constants and lw/ld's in there... though would be nice if that were macro'd so they don't get out of sync
  3. Why comment out riscv32? At least uncomment everything except the one CMake place that says it exists, having ~10 different places with commented-out code is ugly.
llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
260

# before immediates is an AArch64-ism, on RISC-V the syntax is just the plain integer

261

This doesn't match what you actually do

272

addExpr seems fine? That's what you use for an MCSRE. Don't see why it needs commentary here.

276

?

284

Indentation is off

jrtc27 added inline comments.Jan 21 2022, 4:29 PM
compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
77

Does that actually only comment out RISCV32? Phabricator's syntax highlighting thinks not, but it could just be overly simplistic.

compiler-rt/lib/xray/xray_riscv32.cpp
11 ↗(On Diff #402107)

You sure about that 64-bit?

24 ↗(On Diff #402107)

Whitespace here is a mess

33 ↗(On Diff #402107)

Register names have an x prefix not an r prefix

96 ↗(On Diff #402107)

Why are these comments way off to the right like that? This is borderline unreadable.

101 ↗(On Diff #402107)

No. %hi(0x800) *is* 1. Having the if/else results in double-counting were this to be treated as actual assembly. The +1 is part of how hi/lo relocation pairs are defined.

127 ↗(On Diff #402107)

is how you implement hi/lo pairs in a branchless manner, exploiting carry propagation

llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
255

These magic numbers need explaining

257

This sentence ends abruptly (and why is the paragraph wrapped at such a tiny character count?)

llvm/test/CodeGen/RISCV/xray-attribute-instrumentation.ll
2

-filetype=asm is redundant, as is -o -, and you should probably be passing -verify-machineinstrs like our other tests.

Can this be made to work with update_llc_test_checks.py? My guess is probably not due to the filtering it does, but it would be nice if it can be.

7

Do you really need all of those attributes? I highly doubt noinline is needed, sane for uwtable. I don't think you'll get CFI either without nounwind, but could be wrong, never quite know when it's needed to squash CFI.

67

Not sure what this {{.*}} achieves

69

Given you've already explicitly stated the label names in the CHECK lines for the function body it would seem prudent to verify the right ones are used here

ashwin98 added a comment.EditedJan 22 2022, 6:53 AM

Thank you for your feedback! I could combine the riscv32 and 64 cpp files with some xlen conditions if that will work better, but that might take a bit of a hit in terms of readability (do I explain both sleds in the comments preceding the implementation). I have commented riscv32 out because I haven't managed to test it out yet, I've had some difficulty getting llvm set up for riscv 32 - I will leave riscv32 commented out only in the cmake file.

I wasn't too sure about how to work around sign extension in RISCV, which you have picked up on - adding 0x800 wasn't something I thought of. Thinking about it a bit more, it makes sense, we're not adding 4096 like how I was, though it has the same effect; I'll reason it out, I'm sure carry propagation deals with it like you've mentioned. I'll update the code to reflect the same.

I had a related question with respect to the 64 bit sleds though - given that lui is also sign extended, we need a work around for it as well while constructing the 32 bit values, and while combining the 2 32 bit values into a 64 bit value. I have currently been getting rid of the upper 32 bits by performing a left shift followed by a right shift, but I'm sure there is a better solution to it.

Thank you for your feedback! I could combine the riscv32 and 64 cpp files with some xlen conditions if that will work better, but that might take a bit of a hit in terms of readability (do I explain both sleds in the comments preceding the implementation).

In one sense yes it will be slightly less readable. In another sense it actually makes it more readable, because seeing the XLEN-based conditions makes it clear what things are word-sized and what things are explicitly 32-bit (in the RV32 code any LW is unclear whether it's loading an int, a size_t or a void *). And yes, you explain both, but most of it is the same so can be combined into a single explanation, e.g. like LLD does for documenting its PLT stubs in lld/ELF/Arch/RISCV.cpp.

I wasn't too sure about how to work around sign extension in RISCV, which you have picked up on - adding 0x800 wasn't something I thought of. Thinking about it a bit more, it makes sense, we're not adding 4096 like how I was, though it has the same effect; I'll reason it out, I'm sure carry propagation deals with it like you've mentioned. I'll update the code to reflect the same.

It's important that it's only added when computing the HI relocation. As an example, %hi(0x81734) (to pick a number at random that's not too boring) would be (0x81734 + 0x800) >> 12 = 0x81f34 >> 12 = 0x81, whereas %hi(0x81934) = (0x81934 + 0x800) >> 12 = 0x82134 >> 12 = 0x82. You can see how the adding 1 << 11 combined with right-shifting by one more results in adding one to the upper 20 bits if and only if bit 11 of the input is 1; if it's 0 there is no carry out so the only bit that's modified is bit 11, which the right shift will shift out.

I had a related question with respect to the 64 bit sleds though - given that lui is also sign extended, we need a work around for it as well while constructing the 32 bit values, and while combining the 2 32 bit values into a 64 bit value. I have currently been getting rid of the upper 32 bits by performing a left shift followed by a right shift, but I'm sure there is a better solution to it.

That's one way of doing it, though requires more than one temporary register. RISCVMatInt's generateInstSeqImpl has an alternate sequence documented for the general case (as well as various optimised special cases) that is the same number of instructions but only needs one register. If you have multiple registers then your sequence probably performs better on an out-of-order core. Synthesising 64-bit addresses is pretty inefficient; you might prefer instead just loading from a constant pool adjacent to the code.

ashwin98 updated this revision to Diff 402533.Jan 24 2022, 8:05 AM

Updated the diff, made the following changes:

  1. Merged the riscv files into xray_riscv.cpp and removed the if-else code for %hi()
  2. Cleaned up the issues related to indenting and comments in RISCVAsmPrinter.cpp
  3. Updated the test file to pass -verify-machineinstrs and remove unnecessary attributes as well as {{.*}}s
  4. Fixed riscv32 comments - it is now only commented out in cmake/Modules/AllSupportedArchDefs.cmake

I have been testing this patch on qemu using ubuntu for riscv64, the comment that Phabricator detects in the supported architecture definitions cmake file is probably an issue with syntax highlighting. Nevertheless, we could instead comment out riscv32 in clang/lib/Driver/XRayArgs, which would also throw up an error during compilation stating that the target is not supported.

dberris requested changes to this revision.Jan 25 2022, 12:36 AM

It looks like you'll need to address the lint issues (using clang-format).

I'm not an expert on RISCV assembly so you might need to get someone familiar with the ISA reviewing this too.

For tests, I recommend also adding some for the tooling that consumes the sections of the binary where we expect metadata that XRay expects to be present. You should be able to re-use/extend some of the tests we already have for those to see the end-to-end result (compiling and linking a binary which has Xray instrumentation sleds and associated maps).

This revision now requires changes to proceed.Jan 25 2022, 12:36 AM
ashwin98 updated this revision to Diff 403331.Jan 26 2022, 10:36 AM

Made changes to handle lint issues.

ashwin98 updated this revision to Diff 403707.Jan 27 2022, 9:59 AM

Fixed another lint issue, they should all be done for now hopefully.

@dberris Sorry, I'm a little confused, do you mean I need to update some of the clang tests for XRay instrumentation to test compilation and linking, or add new ones for the llvm-xray tool under llvm/test/tools?

jrtc27 added inline comments.Jan 27 2022, 10:49 AM
compiler-rt/lib/xray/xray_riscv.cpp
123–124

Why? You shift this whole thing left by 32 later

128–129

You might be able to avoid this by adding 0x80000800 before computing "%higher" and "%highest" (feels rather MIPSy... not official things)? Not sure, would need to think this through more, but it feels like it should be possible...

163

This is definitely wrong; you probably mean (((TracingHook >> 32) + 0x800) >> 12) & 0xfffff?

llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
105–106

"Keeping changes minimal" seems like the kind of thing that belongs in a commit message, not a code comment

llvm/test/CodeGen/RISCV/xray-attribute-instrumentation.ll
2–3

Triples are overly verbose; riscv32-unknown-elf is normally just written riscv32, and riscv32-unknown-linux-gnu as riscv64-linux-gnu, though I don't see what point having both serves, we normally only use the bare-metal triples unless something has an OS-specific aspect

ashwin98 added inline comments.Jan 28 2022, 6:24 AM
compiler-rt/lib/xray/xray_riscv.cpp
123–124

Right, I'll update this

128–129

Adding 0x80000000 may be enough, the lower 12 bits should be taken care of when we construct the lower 32 bits, if we choose to use two registers. If we wish to use one register to load all values, then 0x80000800 may be needed - I'm not too sure. About the MIPS and AArch terminology in some places - yeah if there's anything that is not official or consistent with RISCV, please let me know, I frequently consulted the files for the other ISAs to figure out XRay's implementation and ended up using inconsistent terminology at some places.

163

True

ashwin98 updated this revision to Diff 405432.EditedFeb 2 2022, 2:09 PM

Updated the riscv64 sled function to fix the addition/shift operations and get rid of the superfluous slli and srli instructions. Cut out unnecessary comments in the ASM Printer.

ashwin98 added inline comments.Feb 2 2022, 2:17 PM
llvm/test/CodeGen/RISCV/xray-attribute-instrumentation.ll
2–3

I will cut it down to just the ones with linux. Clang has a check to see if the OS is Linux when passed the flag for XRay instrumentation.

ashwin98 marked an inline comment as not done.Feb 2 2022, 2:18 PM
ashwin98 updated this revision to Diff 405621.Feb 3 2022, 6:45 AM

Removed extra triples from the test.

Fixed another lint issue, they should all be done for now hopefully.

@dberris Sorry, I'm a little confused, do you mean I need to update some of the clang tests for XRay instrumentation to test compilation and linking, or add new ones for the llvm-xray tool under llvm/test/tools?

Yes to both. :)

There are already some tests that ensure the supported triples build with XRay instrumentation and that we can get the instrumentation map with the tooling on binaries generated with clang+lld. Extending those to support risc64 and ensure we're able to find the instrumentation with the tooling gives us higher confidence that this will work when the patch is accepted.

ashwin98 added a comment.EditedFeb 11 2022, 8:21 AM

Fixed another lint issue, they should all be done for now hopefully.

@dberris Sorry, I'm a little confused, do you mean I need to update some of the clang tests for XRay instrumentation to test compilation and linking, or add new ones for the llvm-xray tool under llvm/test/tools?

Yes to both. :)

There are already some tests that ensure the supported triples build with XRay instrumentation and that we can get the instrumentation map with the tooling on binaries generated with clang+lld. Extending those to support risc64 and ensure we're able to find the instrumentation with the tooling gives us higher confidence that this will work when the patch is accepted.

Before running the tests, I've been testing some of the subcommands with the generated logs. When xray account or stack was run directly on the log file, the data for the instrumented functions was output (I believe the XRay tool recognizes it as a YAML file and processes it as such). However, when passing the --instr_map flag, issues crop up. I originally ran into an error about the executable not being an ELF binary. I confirmed that clang was generating ELF binaries. There was a test checking the triple architecture in InstrumentationMap.cpp that seemed to be causing this.

On including RISCV64 to the check, I see new issues while reading the instrumentation map. How do I verify that the instrumentation map is being generated correctly (or if there is a problem with it), and where is the code that is responsible for the generation of the instrumentation map (is it in xray_init.cpp)? I'm not sure if this is a RISCV compatibility issue with the xray tool, or if I've missed something that is causing problems during the instrumentation map initialization.

ashwin98 added a comment.EditedFeb 13 2022, 9:55 AM

Fixed another lint issue, they should all be done for now hopefully.

@dberris Sorry, I'm a little confused, do you mean I need to update some of the clang tests for XRay instrumentation to test compilation and linking, or add new ones for the llvm-xray tool under llvm/test/tools?

Yes to both. :)

There are already some tests that ensure the supported triples build with XRay instrumentation and that we can get the instrumentation map with the tooling on binaries generated with clang+lld. Extending those to support risc64 and ensure we're able to find the instrumentation with the tooling gives us higher confidence that this will work when the patch is accepted.

Before running the tests, I've been testing some of the subcommands with the generated logs. When xray account or stack was run directly on the log file, the data for the instrumented functions was output (I believe the XRay tool recognizes it as a YAML file and processes it as such). However, when passing the --instr_map flag, issues crop up. I originally ran into an error about the executable not being an ELF binary. I confirmed that clang was generating ELF binaries. There was a test checking the triple architecture in InstrumentationMap.cpp that seemed to be causing this.

On including RISCV64 to the check, I see new issues while reading the instrumentation map. How do I verify that the instrumentation map is being generated correctly (or if there is a problem with it), and where is the code that is responsible for the generation of the instrumentation map (is it in xray_init.cpp)? I'm not sure if this is a RISCV compatibility issue with the xray tool, or if I've missed something that is causing problems during the instrumentation map initialization.

I traced the root cause of the issue. It seems to be stemming from the instrumentation map's relocation handling, specifically, at line 129 of InstrumentationMap.cpp, when we try to extract the load address of the symbol. I believe that part of code was written keeping AArch64 in mind, but I'm not too sure about what changes will need to be made to add RISCV64 compatibility, I'm trying to figure it out.

On including RISCV64 to the check, I see new issues while reading the instrumentation map. How do I verify that the instrumentation map is being generated correctly (or if there is a problem with it), and where is the code that is responsible for the generation of the instrumentation map (is it in xray_init.cpp)? I'm not sure if this is a RISCV compatibility issue with the xray tool, or if I've missed something that is causing problems during the instrumentation map initialization.

I traced the root cause of the issue. It seems to be stemming from the instrumentation map's relocation handling, specifically, at line 129 of InstrumentationMap.cpp, when we try to extract the load address of the symbol. I believe that part of code was written keeping AArch64 in mind, but I'm not too sure about what changes will need to be made to add RISCV64 compatibility, I'm trying to figure it out.

If you can turn the relocations you're emitting in the assembly to be relative instead of absolute when building the instrumentation map like in other architectures, then the tooling will be able to resolve them. Maybe that helps?

I am still interested in a RISC-V XRay port :)

llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
280
llvm/test/CodeGen/RISCV/xray-attribute-instrumentation.ll
9

Use COUNT https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-count-directive

See llvm/test/CodeGen/AArch64/xray-attribute-instrumentation.ll

55

See llvm/test/CodeGen/X86/xray-attribute-instrumentation.ll

use [[TMP:.Ltmp[0-9]+]] to avoid updating .LtmpN when there are somehow other temporary labels.

Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2023, 6:52 PM
Herald added subscribers: wangpc, luke, Enna1 and 2 others. · View Herald Transcript
MaskRay added inline comments.Jun 23 2023, 8:58 PM
compiler-rt/lib/xray/xray_trampoline_riscv32.S
17

Omit .file

Consider using #include "../sanitizer_common/sanitizer_asm.h" like aarch64 and x86-64.
This is in case we have RISC-V for other object file formats (say Mach-O) in the future.

I am still interested in a RISC-V XRay port :)

Sorry - lost track of this, things have been hectic for the last few months, but we also seem to have got custom events working with riscv64. Should hopefully be able to get back to this by some time in August.

evandro removed a subscriber: evandro.Jun 26 2023, 2:29 PM
ashwin98 updated this revision to Diff 552054.EditedAug 21 2023, 9:37 AM
ashwin98 added a reviewer: MaskRay.

Ok, I've tried to make the changes you've suggested @MaskRay , let me know if I've missed something.

On another note, I lost my previous qemu VM and have rebuilt LLVM with this patch in a new one; I am now getting the following issue when I try running a binary with XRay basic enabled:

SanitizerTool: CHECK failed: sanitizer_allocator_primary32.h:292 "((res)) < ((kNumPossibleRegions))" (0x7fff93a, 0x40000) (tid=24873)

A previous thread (https://reviews.llvm.org/D87580) suggests this might be a qemu issue, but I am using qemu-system-riscv64 (and not qemu user-mode.) However, I figured I'd check anyway.

compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
83

Just make RISCV32 available? Otherwise we wouldn't need riscv32_SOURCES

compiler-rt/lib/xray/xray_riscv.cpp
52

For new functions, consider dropping XRAY_NEVER_INSTRUMENT. The runtime library cannot be instrumented with -fxray-instrument and I an unsure why the original impl adds a lot of XRAY_NEVER_INSTRUMENT.

53
163

(x + 0x800) >> 12 is used many times. We need a helper like lld/ELF/Arch/RISCV.cpp hi20. Ditto for lo12.

287

Unimplemented features should use TODO instead of FIXME.

llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
322

I have many comments in D140727 (LoongArch port). Many are probably useful here as well.

ashwin98 updated this revision to Diff 553135.Aug 24 2023, 8:09 AM
ashwin98 added inline comments.
compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
83

Done

compiler-rt/lib/xray/xray_riscv.cpp
52

Done

53

Done

163

Done. I think I'd frequently looked at the MIPS trampolines while writing this code - I noticed that they were casting the Tracing Hook to int64_t, instead of uint64_t (and int32/uint32 for the 32 bit ISA), which requires the & operation, since >> is an arithmetic shift operation for signed integers. Changing the cast operations to uint64_t should eliminate those & operations without breaking anything else, right?

287

Done

llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
322

I had a look at this. Two things which stuck out to me (both in xray_riscv.cpp):

  • changing inline static to static inline (done)
  • your note about the PO_ style hurting readability when instructions are only used once. In this case, we're using some instructions repeatedly, so I'm guessing it makes sense to continue with the enum, but I can get rid of it if that works better.
ashwin98 updated this revision to Diff 554513.Aug 29 2023, 4:03 PM

Sorry - I missed adding updates with the last diff - I think I've addressed most of the comments. Just realized that the diff didn't include context, updated that as well.

MaskRay added inline comments.Aug 29 2023, 9:14 PM
compiler-rt/lib/xray/xray_trampoline_riscv32.S
55

This local symbol doesn't seem useful. We can just use numbers (1f and 1:) or LOCAL_LABEL(...), so that the symbol table will not have the unneeded symbol entries.

86

stray ASM_SIZE?

compiler-rt/lib/xray/xray_trampoline_riscv64.S
23

See the aarch64 implementation to use ASM_SIZE(__xray_FunctionEntry) to set .size

llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
185

omit braces for single-line single statement body.

llvm/lib/Target/RISCV/RISCVSubtarget.h
164

"does not support compressed instructions" makes this feature infeasible for many systems, I'll say probably almost all systems that may consider XRay.

ashwin98 updated this revision to Diff 555925.Sep 5 2023, 1:25 PM
ashwin98 marked an inline comment as done.

Sorry for the delay in getting back to you. I've updated the NOP sled generation and tested it out, it now seems to work with compressed instructions - the patched sleds were identical to what I'd previously observed when compiling with march=rv64g. When I had originally written this in 2021/early 2022, some of the instructions were getting corrupted when they were being patched in - either this was a bug in my implementation that has been fixed over the course of the review process, or it might've been an issue with qemu or llvm at the time that has been fixed in the years since.

ashwin98 added inline comments.Sep 5 2023, 1:27 PM
compiler-rt/lib/xray/xray_trampoline_riscv32.S
55

Done

86

Fixed - I looked at your AArch64 code and tried to follow the same process here

llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
185

Done