User Details
- User Since
- Apr 19 2018, 4:51 AM (257 w, 3 d)
Feb 2 2020
(This was gonna be an inline comment on D69891, but it's more of a general conceptual issue, so I decided to move it here.)
Jan 31 2020
I'm not sure what problem you think there might be? Both code sequences do the same thing (same side effects, same final result) as the input IR they matched, right? So that's what justifies them both as valid outputs and the choice is just a matter of codegen quality. You don't even need to appeal to the vp.fadd producing undef in disabled lanes, because in the final result those lanes are zero anyway and that's all that matters. This doesn't seem fundamentally more tricky than any other isel pattern that matches multiple IR instructions to produce a more efficient combined instruction. For example, if the ARM backend selects add i32 %a, (shl i32 %b, 4) as add r0, r0, r1, lsl #4, it never materializes shl %b, 4 (not into a register, at least) but the end result is still correct.
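To make the equivalence argument above concrete, here is a minimal Python sketch. It is a toy lane-by-lane model, not LLVM code. It assumes the pattern under discussion is a predicated add whose disabled lanes are then forced to zero by a select; the actual code sequences from the review are not reproduced here, and every function name below is hypothetical.

```python
# Toy model only: lists stand in for vectors, UNDEF for an undefined lane.
UNDEF = object()  # hypothetical marker for a lane whose value is undef


def vp_fadd(a, b, mask):
    """Predicated add: enabled lanes hold a+b, disabled lanes are undef."""
    return [x + y if m else UNDEF for x, y, m in zip(a, b, mask)]


def select(mask, on_true, on_false):
    """Lane-wise select, as in the IR select instruction."""
    return [t if m else f for t, f, m in zip(on_true, on_false, mask)]


def unfused(a, b, mask):
    # Sequence 1 (assumed): materialize the vp.fadd result, then zero
    # the disabled lanes with a separate select.
    return select(mask, vp_fadd(a, b, mask), [0.0] * len(a))


def fused_zeroing(a, b, mask):
    # Sequence 2 (assumed): one combined operation that writes zero to
    # disabled lanes and never materializes the intermediate vp.fadd.
    return [x + y if m else 0.0 for x, y, m in zip(a, b, mask)]


def ir_add_shl(a, b):
    # add i32 %a, (shl i32 %b, 4), with the shift as a separate value
    shl = (b << 4) & 0xFFFFFFFF
    return (a + shl) & 0xFFFFFFFF


def arm_add_shifted(a, b):
    # add r0, r0, r1, lsl #4: the shift is folded into the add operand
    return (a + (b << 4)) & 0xFFFFFFFF


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
mask = [True, False, True, False]
print(unfused(a, b, mask) == fused_zeroing(a, b, mask))
print(ir_add_shl(3, 5) == arm_add_shifted(3, 5))
```

Both checks compare only the final observable result, which is the point of the comment: the undef lanes of the intermediate value never reach the output, so either sequence is a valid selection.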
Aug 4 2019
Clearly the current semantics of LLVM IR have a forward progress guarantee and I agree that should be documented.
May 10 2019
I know very well how annoying it can be to read and write (and say) the scalable prefix all the time and wish for something shorter sometimes, but I also prefer <vscale x ...> for the reasons Sander gave. I'll add that <vscale x 4 x i32> feels a bit lighter than <scalable 4 x i32> even though it's the same number of characters (maybe because there's more whitespace?).
Nov 5 2018
Today I took a stab at changing my RVV patches to use these intrinsics and that basically went well, affirming my belief that these intrinsics are a good fit for RISC-V vectors. I stashed those changes for now rather than continuing to build on them because currently I can't match them with plain old isel patterns, so I'd have to write annoying and error-prone custom lowering. That should be a temporary issue, partly due to how I don't really handle predication at the moment, partly due to a surprising extra argument on loads and stores (see inline comment).
Oct 31 2018
This seems like yet another step in the right direction. Of course I may be biased as I've already been happy with previous iterations.
With the semantics defined in @simoll's proposal, the active vector length is actually subtly different from predication in that the former makes some lanes undef while predication takes the lane value from another parameter. I actually don't know what motivates this: in RISC-V, masked-out lanes and lanes beyond VL are treated the same, and this seems the most consistent choice in any ISA that has both concepts (and ISAs that only have predication would legalize the active vector length with predication, so they too would treat all lanes the same). Is there an architecture I'm not aware of that makes past-VL lanes undef but leaves masked-out lanes undisturbed?
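The distinction drawn above can be sketched with a small Python lane model. This is only an illustration of the semantics as the comment describes them, not the proposal's actual definition or any real intrinsic; the function names, the passthru parameter name, and the use of None for undef are all assumptions made for the example.

```python
UNDEF = None  # hypothetical stand-in for an undef lane


def proposal_add(a, b, mask, passthru, evl):
    """Semantics as described in the comment: lanes at or past the
    active vector length are undef, masked-out lanes take passthru."""
    out = []
    for i, (x, y) in enumerate(zip(a, b)):
        if i >= evl:
            out.append(UNDEF)
        elif mask[i]:
            out.append(x + y)
        else:
            out.append(passthru[i])
    return out


def uniform_add(a, b, mask, passthru, evl):
    """Alternative where past-VL lanes are treated exactly like
    masked-out lanes (both take the passthru value)."""
    return [
        x + y if (i < evl and mask[i]) else passthru[i]
        for i, (x, y) in enumerate(zip(a, b))
    ]


def refines(concrete, spec):
    """A concrete result is acceptable if it matches the spec in every
    lane the spec actually defines; undef lanes may hold anything."""
    return all(s is UNDEF or c == s for c, s in zip(concrete, spec))


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
mask = [True, False, True, True]
passthru = [-1, -1, -1, -1]
spec = proposal_add(a, b, mask, passthru, evl=3)
impl = uniform_add(a, b, mask, passthru, evl=3)
print(spec, impl, refines(impl, spec))
```

In this model the uniform treatment is always a valid implementation of the undef-based semantics, since an undef lane may be filled with any value. The two differ only in what a consumer is allowed to assume about past-VL lanes.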
Oct 24 2018
Thanks a lot for this proposal! It's very unfortunate I couldn't be at the dev meeting to discuss in person.
Sep 11 2018
I think I found a typo, but otherwise LGTM too!
Aug 16 2018
The implementation looks good to me. The interface chosen here (directly mirroring CmpInst from the Value hierarchy in the VPValue hierarchy) also seems like the right direction to me. Besides avoiding the problematic concept of "underlying Instructions" altogether, it also gives a convenient place to put any helper functionality that the vectorizer code might want when generating and manipulating such comparisons.
Jul 16 2018
Thank you! I took another look and found two nits, sorry for not pointing them out earlier.
Jul 12 2018
I just realized the updated RFC doesn't touch on the issue at all, but I think it's safe to say we won't support globals of scalable vector type? Those seem impossible to implement in a sensible way for RISC-V, and if my memory and quick skim-reading are correct, they aren't part of the SVE C language extensions either. If that's correct, I'd expect the verifier to reject global variables whose type is a scalable vector.