- User Since
- Aug 6 2013, 5:31 AM (258 w, 2 d)
Hi Jocelyn and welcome to the LLVM community. I'm really excited about this approach to testing the assembler. I've just got a few comments for now.
This looks good to me with two caveats
This is looking good to me, I've added some comments. The logic around callee/caller save registers is a little confusing but the way you're handling it seems sensible. A comment clarifying things in determineCalleeSaves would be really useful I think.
Could you add a test that shows .4byte and .long are now correctly handled too? e.g. by extending micromips-label-test.s
Thu, Jul 12
Mon, Jul 9
Adding Philip Reames as a reviewer, who kindly offered on Twitter to take a look at this.
Wed, Jul 4
Sure, but it does sound like the handling of .4byte, .long may incorrectly set STO_MIPS_MICROMIPS in the current implementation so perhaps there's a bug to be filed.
Mon, Jul 2
Thanks Ana. I've added some initial review comments. Sorry for the slight delay, I've been on vacation the past week.
Thanks for the review and sorry for the delay in committing. Committed in rL336100.
Thu, Jun 21
Hi Ana, if you're planning on updating it would probably be worth following the work on a replacement for SearchableTable:
This was fixed in rL335202. In general, do feel free to direct commit minor test or build fixes without pre-commit review.
Wed, Jun 20
Looks good to me, thanks!
Jun 14 2018
Hi Shiva. I think your tests need to demonstrate the behaviour when there is no C extension support as well.
Thanks! Looks good to me, but two minor suggestions:
- arith-with-overflow.ll might be a better name
- Add the nounwind attribute to the function definitions, so we still get clean assembly output when cfi directives are emitted.
This looks good to me (in that the motivation makes sense and the change seems to implement the desired change). Only caveat is that I'm not familiar with the code paths for bundled instructions.
Thanks, this looks good to me assuming the minor nits are addressed. The only one requiring code changes is the for loop in matchLargeOffset.
Jun 13 2018
Thanks, I'll have to finish stepping through the code tomorrow but this seems to be working great. For the GCC torture suite at O1, ~40 of the test cases see improved codegen (fewer instructions). Two gain a single instruction, but that's just noise due to slight regalloc differences.
- Expand masked AMOs to IR for address calculation and masking + an intrinsic for the core LL/SC loop. Introduce a new hook in TargetLowering to allow this
- Support for part-word atomicrmw max/min/umax/umin
- Use MaskedMerge rather than calculating an inverse mask
- Expand atomicrmw sub i32 to (AMOADD_W GPR:$addr, (SUB X0, GPR:$incr)) with appropriate AQ/RL bits
Update to add atomicrmw max/min/umax/umin to tests.
Jun 8 2018
Update to use fence.tso for fence acq_rel.
Jun 7 2018
Looks good to me. Many thanks for your work on this - I'm committing now.
This patch now enables lowering of 8, 16, and 32-bit atomic load/stores. We no longer rely on D47553, as the follow-up patch to lower atomicrmw supports partword and native size atomics.
Thank you everyone for your comments. I've removed the dependency on this patch from my queue of patches, and instead implement 8/16/32-bit atomics in one go.
Update to address all outstanding review comments.
Jun 1 2018
May 31 2018
Update to address comments from @jfb (thanks!).
Thanks Ed, looks good to me.
Updated patch to document that shouldExpandAtomicToLibCall should return the same result for objects of the same size.
I'm going to un-abandon this patch. This hook is still helpful in order to e.g. lower to __atomic libcalls for all 8 and 16-bit operations but not 32-bit. This seems acceptable based on the linked GCC and LLVM docs, but isn't possible with the current MaxSizeInBitsSupported. Am I missing any problems with that strategy?
May 30 2018
Are there any plans to push this forwards?
Thanks Eli and JF for the rapid feedback. I was too quick to try to replicate GCC behaviour without running sanity checking its output.
Thanks Eli, I actually just came to the same conclusion when taking a step back and thinking about it some more. I was caught up on matching RISC-V GCC behaviour in this case, when in fact it seems RISC-V GCC is broken here. The LLVM docs on atomics also do document this requirement https://llvm.org/docs/Atomics.html#atomics-and-codegen. The 'blunt instrument' of MaxSizeInBitsSupported is acting as designed.
May 29 2018
Thanks, looks good to me. Probably worth retitling the commit to e.g. "[RISCV] Support resolving fixup_riscv_call and add to MCFixupKindInfo table"
Ok, there _is_ a problem using addi rather than addiw. Consider loading the immediate 0x7fffffff.
May 28 2018
Looks good to me, thanks!
May 24 2018
@echristo: Do you think you might be able to help review this?
I'd still err towards .option norelax meaning "do what I say". If a user freely intermixes .option relax and .option norelax within a section then there's a whole bunch of things that might break. Simply ignoring the norelax doesn't seem right, and picking out particular cases (e.g. symbol diffs) where we'll act as if relaxation was enabled also seems a bit dodgy. I still think that erroring might be safest and simplest path.
By way of update: the only thing stopping me from committing this right now is convincing myself that there's never a reason to use addiw rather than addi for the first constant when expanding li for RV64. I'm just going to step through some more examples...