- User Since
- May 24 2016, 8:35 AM (201 w, 3 d)
Nice one. LGTM
Thanks, much appreciated. And thanks for adding the extra test.
Yeah, the multiple uses are a pain.
Wed, Apr 1
The code looks OK. I think update_llc_test_checks should work, I've used it elsewhere in the past.
The code that already exists in the function you are changing is essentially doing the same thing as you add here, just in a more constrained set of circumstances. It is saying that if the operands are obviously invertible, then the VPNOT will be really free and we can go ahead and invert the and to an or. The question is whether that is true for all cases or not. For the test cases you have here it is true that the operands are easily invertible, but that won't be true for everything.
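For reference, the and-to-or inversion mentioned here is just De Morgan's law applied to predicate masks. A minimal sketch, assuming 16-bit masks and using hypothetical helper names (none of this is taken from the patch):

```cpp
#include <cstdint>

// Hypothetical stand-in for a predicate mask; the 16-bit width is an
// assumption made for illustration only.
using Pred = std::uint16_t;

// not(and(a, b)): the and is computed first, then one extra inversion.
Pred not_of_and(Pred a, Pred b) {
  return static_cast<Pred>(~(a & b));
}

// or(not(a), not(b)): the same value, but only cheaper when the
// inverted operands are available without extra instructions.
Pred or_of_nots(Pred a, Pred b) {
  return static_cast<Pred>(~a | ~b);
}
```

The two forms agree for every input. The second is only a win when inverting each operand is free, which is exactly the open question for the general case.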
Oh yeah. Because we are now sinking float splats. Still LGTM.
Hello. Unfortunately we have some tests where it looks like this is needed. This code: https://godbolt.org/z/DpJuFm is no longer reaching the same minimum after this patch. Any ideas?
Tue, Mar 31
Mon, Mar 30
Hello. I like the idea. It's something we thought about internally, but no one has ever worked on it enough to see how much of an improvement it gives in general.
We had these patterns before and took them out because they were not correct. My understanding is that these instructions do trunc(shift(add(sext(a), sext(b)), 1)). They internally operate in a higher bitwidth than we natively have.
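To make the difference concrete, here is a small scalar sketch for a single 8-bit lane. The function names are hypothetical, and it assumes two's complement integers with an arithmetic right shift on negative values (guaranteed from C++20, and what common compilers do before that):

```cpp
#include <cstdint>

// Wrap an int into the signed 8-bit range, modelling an add done in i8.
int wrap_to_i8(int x) {
  x &= 0xFF;
  return x >= 128 ? x - 256 : x;
}

// trunc(shift(add(sext(a), sext(b)), 1)): the add happens in a wider
// type, so the carry out of bit 7 is kept until after the shift.
int halving_add_wide(std::int8_t a, std::int8_t b) {
  int sum = int(a) + int(b);
  return wrap_to_i8(sum >> 1);
}

// shift(add(a, b), 1) with the add done in i8: the sum wraps before
// the shift, which is what a pattern on the narrow type would match.
int halving_add_narrow(std::int8_t a, std::int8_t b) {
  int sum = wrap_to_i8(int(a) + int(b));
  return sum >> 1;
}
```

With a = b = 100 the wide form gives 100, while the narrow form wraps the sum to -56 first and ends up with a different value (-28 with an arithmetic shift). The two only agree when the add does not overflow the lane type.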
Sun, Mar 29
Tests look nice on this one.
Fri, Mar 27
Unpredicatable -> Unpredictable
Thu, Mar 26
Added some tests for both top and bottom vmull's
Wed, Mar 25
Unfortunately, I've written this pass in a single commit, so there is no easy way for me to split this patch in 2.
I can do it if you want, it's not impossible, but it's going to take me a while to get right. Also, the second optimisation is a relatively small part of this patch, so patch #1 would still be a large patch.
Tue, Mar 24
Is it possible to split this into two patches? First the pass with the "replaces VCMPs with VPNOTs when possible" part, then a second patch to replace the re-use with the not. I think that would make each part easier to review and more manageable.
We added a VECTOR_REG_CAST, which is like a bitcast but doesn't change the bits. Similar to the AArch64 NVCAST.
No longer a virtual method.
Mon, Mar 23
Amazing. I was going to say I'm not surprised we get big endian wrong. But it's a movimm. I am a little surprised.
I see. Because we are just swapping around the same values anyway. Makes sense.
I have altered the way that MVE VDUP are lowered, and we end up with a VMOV now instead of using a COPY. That means I don't need this any more for MVE at least.
Rebased onto the VDUP type changes.
Sun, Mar 22
Fri, Mar 20
Sounds great, from what I can see. The predicated lowering looks useful when/if we try and get predicated vecreduce's working.
Thu, Mar 19
I'm still not convinced that this shouldn't be done in ISel. There's nothing cross-block going on, so this is what ISel is designed for. It might make sense not to think of this as trying to convert vector_shuffle to something else, but instead as trying to convert what vector_shuffle has been turned into to something more optimal. In the one case I looked at, there was something like a (v4i32 (ext (v2i32 (buildvector (v4i32,..))). We get this way because a v2i16 was legalised to a v2i32, but everything around it was a v4i32. Can we "flatten" the ext into a single BUILDVECTOR? I have not had time to see if that is or isn't possible, but it sounds more sensible than very special-case pre-isel legalisation for certain shuffle_vector's.
Tue, Mar 17
Mon, Mar 16
Now using TargetRegisterInfo::shareSameRegisterFile.
Fri, Mar 13
Thanks for the suggestion.
Thu, Mar 12
Wed, Mar 11
Good to see this getting updated.