- User Since
- Jun 18 2014, 2:14 AM (292 w, 1 d)
Jun 7 2019
Jun 12 2018
I don't understand what you're saying. Once you shift, you know that the lowest-order bit is zero. You can naturally combine that with whatever other information also happens to be known about idx.
knownZeros(n) == (knownZeros(idx) << 1) | 1 and knownOnes(n) == knownOnes(idx) << 1
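To make the relation above concrete, here is a minimal sketch of the same propagation for n = idx << 1. The `Known32` struct and the `shlByOne` helper are hypothetical, illustrative names limited to 32 bits; they are not LLVM's actual KnownBits API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit stand-in for a known-bits record (not LLVM's KnownBits).
struct Known32 {
  std::uint32_t zeros; // bits known to be 0
  std::uint32_t ones;  // bits known to be 1
};

// Known bits of n = idx << 1, following the relation quoted above:
// every known bit moves up by one position, and bit 0 becomes known zero.
inline Known32 shlByOne(Known32 idx) {
  // A bit cannot be known zero and known one at the same time.
  assert((idx.zeros & idx.ones) == 0 && "conflicting known bits");
  Known32 n;
  n.zeros = (idx.zeros << 1) | 1u;
  n.ones = idx.ones << 1;
  return n;
}
```

Even when nothing is known about idx (both masks are 0), the result still records that the lowest-order bit of n is zero, and any additional facts about idx carry over shifted by one position.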
I meant that if nothing is known about idx, we miss the no-alias opportunity. If KnownBits gives us nothing, let's just check for an odd number and that idx is used by both offsets, as below.
Apr 11 2017
Apr 10 2017
Apr 9 2017
Apr 4 2017
Mar 24 2017
I believe the argument lacks numbers (or you have them but didn't mention them). I haven't heard about performance results, or validation that this was actually tested for correctness. Small test cases prove a point, but can't be considered general.
OTOH, it seems like this is exactly why you want the flag: to hide any potential issues and play with it. I'm not opposed to adding the flag if there's a commitment to actually get the results and change the default to whatever seems to be better. Does that seem reasonable?
Mar 23 2017
Preserved vec3 type on __builtin_astype.
Mar 22 2017
Yes. This would make sense. I am guessing that in vec3->vec4, we will have 3 loads and 4 stores and in vec4->vec3 we will have 4 loads and 3 stores?
Mar 21 2017
Mar 16 2017
Mar 15 2017
The motivation doesn't seem solid to me. Who else is going to benefit from this flag? You also didn't explain why doing this transformation yourself (looking through the shuffle) in your downstream pass isn't enough for you. We generally try to avoid adding flags without a good reason.
I believe the assumption is more practical: most upstream LLVM targets only support vectors with an even number of lanes. In those cases you would have to expand to a 4x vector and leave the 4th element as undef anyway, so it was done in the front-end to get rid of it right away. GPU targets probably do some special tricks here during legalization.
I compiled the code below, which current clang generates for vec3, using llc with the amdgcn target.
Mar 14 2017
Mar 13 2017
Mar 10 2017
I am so sorry. I missed it. I have updated it.
Changed help text for the option and added a test file.
Added -f prefix to option name.