- User Since
- Jan 4 2017, 12:12 PM (187 w, 5 h)
Looks fine, other than please mention something about it being an alias in the commit subject when landing as currently it reads as if it's a new CSR (and I'd also personally change "This patch enables the use of mucounteren." to "This patch enables the use of the old mucounteren name." to make it completely clear, but if the subject is fixed then I don't think it's _necessary_).
Sun, Aug 2
Patch looks good but please fix the title and description of this revision. It's not about adding AltName, it's about adding mountinhibit itself, and as part of that you're including support for the older name too.
Sat, Aug 1
Actually the code in the backtrace points at D82651, not this. I did however see buildAnyextOrCopy in a backtrace when playing around to get that minimal case, which is what brought me to this revision. Investigating further, it seems that this is because of D82651 not fully implementing this case, and future revisions not fixing that. Prior to that revision, inline asm would hit reportTranslationError and then fall back to SelectionDAG due to isGlobalISelAbortEnabled (provided -global-isel wasn't passed to override that, as is done in my test, but not the original C), but now, since it claims to be implemented but not correctly, it instead asserts deep down and is unable to fall back to SelectionDAG.
I believe this has broken AArch64 (-global-isel not required due to defaults, but it's a GlobalISel bug so should force it nonetheless):
Fri, Jul 31
Thu, Jul 30
There is a possibly-compelling argument against using x18: RV32E only gives x0-x15, so would not be able to support the current implementation.
Given @luismarques's comment, have you now actually run the tests (and I mean all RISC-V tests, not just branch-relaxation.ll, in case anything has been missed)? Perhaps I shouldn't take it for granted that people run tests before submitting patches, including revisions (though I'd still verify myself before committing anything), but you really should as otherwise it just wastes our time having to tell you they're broken when you could just run them yourself.
Wed, Jul 29
You've lost the check for the non-PIC one and didn't check that the PIC one actually assembles (it really should, but you never know; the point is that unless you emit a .o the fixups are never actually computed and thus you won't get errors if they end up overflowing).
(and my ; TODO: Extend simm12's MCOperandPredicate so the jalr zero is printed as a jr. then goes away, not because that's been fixed, but because by using PseudoJump we no longer expose it... perhaps that comment should be relocated to RISCVInstrInfo.td's definition of simm12, but it's a minor stylistic thing, it doesn't really matter beyond not being the canonical alias)
This looks good to me now, let's see what others think. But please update llvm/test/CodeGen/RISCV/branch-relaxation.ll to reflect the change in CodeGen from LUI/JALR to AUIPC/JALR, and add PIC RUN lines. Plus whilst you're there I guess it wouldn't hurt to add RV64 variants too.
Although I feel PseudoJump might be a sensible choice even for position-dependent code? By this point we know the branch target is more than 2 MiB away, and I assume that neither LUI/JALR nor AUIPC/JALR will be either relaxable at link time (branch target would need to be a low address) or compressible. Plus AUIPC/JALR is more like JAL in being PC-relative, so it "feels" more "faithful".
Please avoid duplication. Something like this should be better in that regard, both for duplicating between the if/else branches and duplicating with the existing AUIPC-based expansions elsewhere.
(the fact that FreeBSD kernel builds could trigger relaxation is what exposed the bug fixed in https://reviews.llvm.org/D77443)
Thu, Jul 23
- Please use local variables with meaningful names for RISCV::Xn rather than inlining them everywhere and making it harder at a glance to work out what's going on without knowing what the ABI register names are.
Is there a reason for choosing X18? On AArch64 that's either a temporary or saved register depending on ABI, but determined to be the "platform register". Picking X18 on RISC-V because that's the same index as AArch64 seems a little arbitrary, but maybe it happens to make sense.
Wed, Jul 15
Jul 5 2020
Jun 30 2020
My question is: why do we want this additional complexity? F and D both require FMUL/FDIV to be implemented, so saying you support F (and D) but no FMUL/FDIV is a contradiction, no such implementation can possibly exist (it would instead be say RV32I plus a non-standard extension that looks like a subset of the floating-point one). If you want to optimise for area, don't include an FPU, and if you want speed, include one, but this seems like a strange halfway house. Just because GCC has an option for something doesn't mean it makes sense for us to copy. What is your actual use case for this? Are there RISC-V implementations in the wild that do this and we're missing out on being able to target them, or is this just a hypothetical?
Jun 28 2020
May 15 2020
Yeah, I don't think we want to be merging code we can't test even in a non-automated way. Even if this code is completely bug-free, the inability to test it just means we risk having it bit-rot with nobody noticing.
May 12 2020
Macro-op fusion likely means that we'll always schedule the AUIPC with its consumer anyway, but in theory there could be pipelines that don't macro-op fuse and stall on RAW hazards, which could benefit from splitting the pair up.
May 11 2020
I am still very confused why this is necessary. CanLowerReturn barely does anything other have a loop to call other functions that actually do useful work, it seems like the wrong place to hook into things. Your "explanation" is merely just "we want this because we want this". Could you please shed some light on how _exactly_ this is useful? As it stands I am none the wiser.
Having createFPImm and createSFPImm (similarly for the add methods) seems confusing. Can we not have a uniform SFP/DFP split in the names everywhere, since you're already touching the uses of the former due to changing the types?
May 6 2020
May 1 2020
I'm curious as to whether this is actually useful. At least for our CHERI fork, we are tightly integrated with the existing code, reusing bits here and there and hooking in where we need to override things. Are you really able to implement any meaningful extensions by overriding the functions without simply duplicating a lot of the logic (which, whilst it avoids merge conflicts, means that any changes and bug fixes likely don't get copied to your forked functions)?
Apr 29 2020
I'd argue more of the comments than those highlighted are also pointless. There's no point writing a comment spelling out what the next line of code clearly does. They should be reserved for overviews of what's going on, and the details of why the code is as it is. I'm also curious as to why you went for the optimisation relaxations rather than R_RISCV_ALIGN. Support for the latter is mandatory for correctness when using -mrelax, but the former is purely optional. Therefore, I would strongly argue that support for R_RISCV_ALIGN should be added before or in the same commit as any R_RISCV_RELAX optimisations, otherwise we're just shipping something that's knowingly broken.
Apr 28 2020
Apr 23 2020
Apr 19 2020
Apr 12 2020
Apr 10 2020
Apr 9 2020
Apr 7 2020
Just as a note for things that are still missing:
Apr 3 2020
Apr 1 2020
Mar 11 2020
Mar 10 2020
I believe my previous comments have indeed been addressed.
Feb 26 2020
Feb 20 2020
Feb 18 2020
Feb 17 2020
Feb 12 2020
Feb 11 2020
Removed atomic-rmw.ll change; nothing actually changed in there other than comments and a missing . on labels, so that churn can be postponed until it actually gets touched in a meaningful way.
Jan 31 2020
Indexed load and store fusion should not be exclusive to LD and SD. It applies to any (F)Lx/(F)Sx. The literature just talks about LD/ST (note that this is not SD) in the sense of an arbitrary load/store instruction. Similarly for the various other fusion pairs involving memory accesses.
Jan 23 2020
Switch from explicit iterator to recently-added range-based for loop interface. Ok to merge?
Jan 22 2020
Pass correct target to shouldForceRelocation (although unused).
Addressed review comments.
Jan 20 2020
Never mind, I see why this is done, we need to respect the ordering, so skipping the SC misses the release.
Jan 14 2020
Rebased and addressed review comments.