- User Since
- Nov 20 2019, 6:41 AM (72 w, 5 d)
Fri, Apr 9
- Removed extra assert for case where we are interleaving with masks
- Removed warnings check line.
- Combined all tests into one file.
Sure, I can combine them into a single test if you think that's better? I wasn't sure whether in general we prefer to have more concise test files or to have fewer, larger files. I could create a file with a name like sve-type-conv.ll.
Hi @frasercrmck yeah that's right, we can multiply an invalid cost by another value and the cost remains invalid.
The folds here look sensible to me! I just have a minor comment about the add fold.
Thu, Apr 8
Hi @cameron.mcinally I don't suppose you have any plans to do more work on this in the near future?
Wed, Apr 7
- Added a comment and removed warnings check from test.
- I realised that there was still a broken case where the PHIs are treated as uniform, which leads us to fall into the path where the PHI is scalar after vectorisation. We can support the uniform case for scalable vectors. I've fixed the code to work for scalable vectors and added a test.
Ah, sorry, that was just a simple mistake and forget to "Accept Revision"!
Tue, Apr 6
LGTM! Thanks a lot for making the changes.
Thu, Apr 1
A version of this patch was previously merged (D98512) then reverted due to a failure with the X86 sanitiser build that exposed some missing tests from our LLVM test suite regarding pointer PHIs feeding directly into stores. I've attempted another fix here without the previous assert because the logic seems far too complicated for an assert.
Tue, Mar 30
Fri, Mar 26
LGTM with nit addressed!
LGTM! Thanks for the changes!
Hi @nasherm, thanks for all the good fixes here! I think it looks very close now - just have a couple of minor comments. I'm fine with most of the costs as the most important thing for now is to avoid crashing and we can refine these later if necessary. However, I think the truncation costs probably do need fixing.
Ah sorry, I realise now you meant change PHIs (vector or scalar) with nsz and -0.0 to use +0.0 instead!
Hi @dmgreen and @spatel, I've been trying to follow the discussion but I'm not entirely sure I follow what you're proposing @kmclaughlin should do here? Are you suggesting that Kerry could change this patch to:
LGTM as well! Can address Dave's comment before merging I think?
Wed, Mar 24
Tue, Mar 23
Closed by commit 748ae5281d4f7f0ff261ba9e8c57e6b6fcfdb31e
Hi @nasherm, thanks for addressing the previous review comments - it looks much better now! I just have one more comment about the estimated costs in the table.
Mon, Mar 22
- Added verifier tests.
LGTM! Hi @nasherm, thanks for addressing all the review comments! I just have a few minor nits.
- Added an assert that if an instruction remains scalar after vectorisation, then both the instruction and all it's users will not be scalarised. If any users were to be scalarised then potentially that means multiple copies of the instruction will be generated. In the case of GEPs the cost is actually calculated as part of the load/store cost.
- Updated LangRef to better describe the intrinsic behaviour.
Fri, Mar 19
- Fixed buildScalarSteps so that generate the full vector part for VF=(1,scalable).
Thu, Mar 18
- Changed code to check for element sizes >= 8 bits instead of 1!
Wed, Mar 17
Yeah I'm fine with that if @dmgreen is happy? It makes to be consistent with the RecurrenceDescriptor code. I think from what I understand clang will only generate IR that contains the reassoc flag if we've set all the appropriate frontend flags. Therefore, currently the only ambiguity at the moment is when hand-writing IR and using the reassoc flag without nsz, right?
Hi @dmgreen, yes of course you're right. I'd forgotten about the nsz requirement. It's definitely needed at compile time for vectorising FP reduction loops, i.e. clang -freassociative-math -fno-trapping-math -fno-signed-zeroes. I guess adding a check for nsz here is consistent with that?
- Addressed review comments.
- After recent upstream discussion on the SVE sync call it was decided to remove support for vectors of i1 types.
Tue, Mar 16
Mon, Mar 15
Hi @fhahn, I tested the example you pasted above before and after applying my patch and got this debug output:
Although I think the adds you mentioned above @fhahn don't fall into the Scalars variable - I'll double check this though. This patch only affects instructions that are members of Scalars. Users of the IV don't fall into that - only updates to the IV I think.
Thanks for the review @fhahn. If anything this just highlights the poor code coverage we have at the moment for cost calculations when scalarising. At the very least I'll add some tests that cover some of these corner cases, since there was nothing that really broke when I made this change!
I guess for bitcasts the cost may be non-trivial for big endian targets.
Hi @dmgreen, thanks for taking a look at the patch. So you raise a good point - the loads/stores will be dealt with before reaching this variant getInstructionCost, however the GEPs/Bitcasts for those will still be costed here. However, for GEPs at least you can see this code in getInstructionCost:
Mar 12 2021
Hi @sdesmalen, thanks for pointing this out! It looks like for the cases you listed above we actually have: