This patch adds stackmap support for RISC-V without targets (i.e. the nop patchable forms).
|60,070 ms||x64 debian > MLIR.Examples/standalone::test.toy|
Thanks for the contribution! I'm not very familiar with the stackmap/patchpoint/statepoint infrastructure so it may take a bit of time to review this (if anyone else on the CC list has looked at it before, please do dive in!).
One quick question: do you expect this to work on RV32 as well? If so, it would be good to have test coverage for that also.
Hello! I am happy to help! Since it is my first contribution, I am not yet familiar with everything, so it might be a little bit messy at the start, sorry in advance.
For your question, I only implemented it for RV64 for another project, and from what I can think of now, the only issue for the current implementation to work on RV32 would be the method RISCVAsmPrinter::LowerPATCHPOINT, as it materializes a 64 bits jump address. From what I see in the other architectures, the materialized jump address always seems to be 64 bits (if I did not miss anything). Thus, either it can work for 32 bits architecture too, or those architectures only implement 64 bits. I will try locally to check this.
I looked a bit deeper into a RV32 implementation, but in fact, according to this https://llvm.org/docs/StackMaps.html#stack-map-format, stackmap support is currently only implemented for 64-bit platforms. Thus it might require more work in the architecture independent part of stackmap, which I am not totally familiar with for now and I also imagine it would be better to start with the more common architectures instead of RV32.
Don't forget to include the full context in your patches. See https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface
It looks like you also want to use clang-format to reformat the code.
Shouldn't the lowering code check that the subtarget is RV64, and not RV32?
Sorry for the delay, it took me some time to run clang-format. I also added the context as asked. Finally, I added an assert to make sure STACKMAP, PATCHPOINT and STATEPOINT are only used with RV64.
I'm not seeing the codepaths for the case where the intrinsics are used on RV32? You'll want to use report_fatal_error rather than assert (as the assert wont' be checked on release builds).
I replaced assert by report_fatal_error to indeed stop the execution when reaching this point. I did not include any code for when the intrinsics are run on RV32 as accordingly to the LLVM documentation, Stack map is a 64 bits feature. Thus I only implemented it for RV64. Should I do it for RV32 too?
Nit: put some ' digit separators to make it easier to read the constant. E.g. 0x0xFFFF'FFFF'FFFF. (Or use some higher-level function from MathExtras.h if there is one).
Can't we do better than this for materializing the address? If the number of instructions doesn't have to be fixed you can even use generateInstSeq from RISCVMatInt.h to generate a more optimal instruction sequence.
You repeat this code a number of times. Please pull out a helper function EmitNops(N), and maybe one for EmitNop as well.
You might be able to reuse AP.emitNops here, but that returns a fixed number of nops, not a fixed number of nop bytes. Looking at the x86 parallel, I think that's what you want here.
CallTarget can be a global symbol as well as a constant.
I'd suggest splitting this patch into support for patchpoint/statapoint without targets (i.e. the nop patchable forms), and then adding the target resolution separately. The call resolution adds a decent amount of complexity, and is worth reviewing in detail on it's own.
Seeing this on the AsmPrinter is suprising. I'd have expected this code to be in RISCVMCInstLower.cpp. Is there something I'm missing here?
I removed the code related to the target resolution, to put it in another patch as suggested. I still took the comments into account and they will be applied in the other patch. I also used emitNops to avoid code duplication when emitting the Nops.
Applied in next patch.
generateInstSeq is indeed better, thank you!
It seems to be the case for most architectures. It is true however that the lowering of those instructions should be in RISCVMCInstLower.cpp as it would be more consistent. Should I move it there?
You should ignore me here. On a quick skim, I was confusing being in the MCInstLowering with not being on the AsmPrinter class. I was wrong, and you have the right placement.
Is it still invalid with RVC? You can have 2-byte NOPs there...
RISCV doesn't have FastISel support, it is identical to the previous line
Don't line up despite your attempts to make them
|1 ↗||(On Diff #445094)|
You've got no CHECK lines because this is outputting MIR not assembly, hence you want update_mir_test_checks.py
|16–18 ↗||(On Diff #445094)|
I indeed missed the C_NOP. Should we add one when NumNOPBytes % 4 == 2 then, because emitNops only emits 4 bytes NOPs?
I will remove it then.
|16–18 ↗||(On Diff #445094)|
This was added by update_llc_test_checks.py.
Why do you have both stackmap.ll and rv64-stackmap.ll? And why does stackmap-frame-setup.ll not have an rv64- prefix like the rest (once that duplication is resolved)?
I don't know, it depends what you want stackmap code to do, which is something I'm not familiar with. All I know is that, without additional context, requesting 18 bytes of NOPs is a perfectly reasonable thing to do on RISC-V, just as requesting 7 bytes is reasonable on x86 due to it having 1-byte NOPs (0x90). I'll note that @reames already pointed out the discrepancy between emitNops and getNumPatchBytes in terms of bytes vs instructions below.
Also, AsmPrinter::emitNops emits whatever TII->getNop() returns, which is C_NOP on RISC-V if RVC is enabled, so you currently have a bug that this is half the number of bytes being requested in that case.
You have a bunch of spaces like this left in various files
Why's this in a test? This is documentation for how to use llc...
|3 ↗||(On Diff #445564)|