- User Since
- May 10 2016, 6:42 AM (359 w, 2 d)
Mon, Mar 27
Feb 13 2023
Thanks for double checking, I'll land it soon.
flang still uses this header.
Feb 9 2023
Hi @tlwilmar could you elaborate a bit on this part as it is very unobvious at first
Feb 2 2023
I believe that "unfettered" reads (and other uncontrolled changes) to the vl / vtype should not prevent us from creating an efficient intrinsics-based programming model for RVV and this change aligns with this view.
Dec 22 2022
I'm seeing failures on the llvm testsuite after this change (scalar only).
Oct 25 2022
I'm seeing failures in the llvm-testsuite on riscv64-unknown-linux-gnu with -O2 (no vectors) which git bisect attributes to this change.
Oct 11 2022
At risk of introducing noise: would it make sense to use some pseudo instruction (e.g. RISCV::PseudoVUNDEF_Mx) instead of the zero initializations and then drop it later (say when emitting MCInst)?
Oct 10 2022
Oct 6 2022
The whole proposal seems very reasonable to me and definitely an improvement to the current status quo. Thanks a lot for putting this together @jeanPerier.
Aug 31 2022
Good catch. LGTM.
Jul 27 2022
Jul 26 2022
Jul 13 2022
I'm too in favour of disabling subregister liveness until we can model the problematic instructions constraints in a more robust way.
Jun 28 2022
Jun 8 2022
This is a huge number of test changes so I have checked a subset of files but the changes seems reasonable, as I imagine they're generated mechanically.
(I was a bit confused by the refactoring of computeBuiltinTypes but it is definitely better this way)
LGTM. Thanks @kito-cheng
May 25 2022
Can we test some illegal type here (e.g. vscale x 16 x i64). The patch includes logic for that so it may be worth having it.
May 19 2022
May 13 2022
I added the link but I forgot the Differential Revision: marker. This was fixed in 189ca6958e84
Accidents happen.. Thanks for the prompt review @craig.topper.
I'll post a fix shortly.
Apr 28 2022
Just to understand the design principle behind these patches: adding FRM as Use to the v*cvt instructions (like we did in D121087 ) would not help us to implement floor and ceil. I imagine one option could be adding specific pseudos for round up and round down and then a later pass could set the rounding mode and restore it later. However, this would not give great code generation and it adds even more instructions to our already large number of pseudos. So your approach goes by adding a new operand with the rounding mode, similar to the scalar operations, and in D123371 you propose a pass that adjusts the FRM register.
Apr 27 2022
Apr 25 2022
Apr 21 2022
One interesting thing is that computeIncomingVLVTYPE doesn't seem to be fully aligned with what emitVSETVLIs will do. If the latter chooses to skip a vsetvl then the Exit of that basic block might potentially be different to the one that we determined in computeVLVTYPEChanges and computeIncomingVLVTYPE.
Apr 7 2022
Other than the nit above, this LGTM.
This looks good to me, based on what we discussed on D123179 with this change the stack looks like this
I'm ok with this once D123178 lands.
Apr 6 2022
This is a sensible low hanging fruit we want to have until VP_SETCC is taught more tricks.
LGTM. Thanks for the cleanup @craig.topper !
LGTM. Thanks @craig.topper !
Apr 1 2022
Mar 30 2022
I must be missing something very obvious: wouldn't 2.0 round to 2 and then give us i1 0 because its LSB is 0?
Mar 24 2022
I understand we need this for proper modelling of FRM when lowering constrained FP operations
Mar 23 2022
Mar 22 2022
- Improve the test with assertions verified by test_folding.py as suggested
Yes, SNGL is a specific intrinsic (defined in section 16.8 in Fortran 2018 standard), while DBLE is a generic intrinsic (defined in section 16.9). Intrinsic procedure call resolution (Probe in lib/Evaluate/intrinsics.cpp) replaces specific names by the related generic names. So SGNL is replaced by REAL there, but DBLE is not. So your patch in folding makes sense to me.
I was wondering why similar intrinsics SNGL and FLOAT work already but they seem to be lowered earlier to REAL, does this make sense?
Mar 17 2022
LGTM. Thanks for the cleanup @craig.topper
Mar 9 2022
I fear that adding more custom combined nodes would get unwieldy as they don't scale particularly well: we'd probably want corresponding ones for integer madd/macc, for narrowing operations, etc. They may inhibit theoretical optimizations we can perform on generic VP nodes. But the key word is "theoretical" and so that's just a gut reaction. We do have custom nodes for all of the widening operations, so it's feasible.
Hi @arcbbb I may be missing context here
Let's do that, otherwise there is doubt when to use one approach or the other.
Mar 7 2022
Ok, regarding the testing in sink-splat-operands-commute.ll I took the relevant operations in sink-splat-operands.ll and swapped the input so the broadcast operand is the first one. They generate the same input (the only difference is the basic block IDs).
Mar 2 2022
Thanks for the cleanup @khchen !
Feb 24 2022
I understand srcval here is
Feb 16 2022
LGTM. Thanks @khchen!
LGTM. Thanks @khchen!
Feb 14 2022
Feb 9 2022
To clarify I understand the motivation of this: lowering won't have to special case every constant in the AVL operand just in case it is VLMaxSentinel, right?
Feb 7 2022
Post-commit review: LGTM
Feb 2 2022
Ok thanks for the clarifications!
Jan 27 2022
Just to confirm, this is not fixed vector specific, is it?
Jan 26 2022
Looks reasonable to me. Thans @khchen!
Jan 20 2022
Jan 7 2022
Dec 30 2021
Dec 22 2021
Though perhaps we can improve the generation of -0.0 in another patch.
Dec 1 2021
Thanks for the patch @victor-eds!
Nov 18 2021
Nov 11 2021
Being myself far from an expert in floating-point, the logic and generated code seem correct to me.
Oct 27 2021
I'm curious about how we handle _Float16 here.
Oct 22 2021
Looks good to me too. Thanks a lot @craig.topper !
Oct 20 2021
Oct 18 2021
Thanks @dcaballe for the patch. A bit surprising that we didn't notice this earlier but I guess masking does not get used that often.
Oct 12 2021
No objection with this, I guess it is for consistency with GPRs, right?