This is because the Rda parameter of a vminv is treated as being the same size as the elements, not as a 32-bit integer.
Sat, Sep 21
Nice. LGTM then.
Fri, Sep 20
Is it possible for the CPSR to already be live? Live across the loop or something like it?
Thu, Sep 19
Tue, Sep 17
I'm afraid the upstreaming of CMSE has stalled, and this is not all that would be needed to get it working. This adds some header files and clang builtins, the selection of them in the backend isn't yet present, hence the error you are seeing. There are more patches to follow around lowering intrinsics and clearing registers correctly.
Hello. Yes, that should all be fixed now. Let us know if anything else is looking off.
Now using long_shift as an ImmLeaf.
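For context, an ImmLeaf wraps a C++ predicate over the immediate being matched. A standalone sketch of the kind of check a long_shift leaf might carry (the 1..32 range is my assumption for illustration, not taken from the patch):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of an ImmLeaf-style predicate: MVE long shifts take shift
// amounts in a limited range, so only those immediates should match.
// The exact range here is assumed for illustration.
bool isLongShiftImm(int64_t Imm) {
    return Imm >= 1 && Imm <= 32;
}
```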
The widening/narrowing MVE loads and stores, like MVE_VLDRBU32 (the B with the 32, in this example), can only take "thumb" registers, as they only have 3 bits for the Rn operand, which is what I meant by "can't take SP". This is not something that any other stack load/store has had to deal with in the past, and hence RegClass->contains(ARM::SP) is added here to check for such cases. Similar to https://reviews.llvm.org/D66285.
The AddrModeT2_i7, etc. can still be generated from the non-widening forms of the MVE loads/stores (as they accept a GPR, not a tGPR, they can take SP too). I don't think the narrowing loads (the ones that cannot accept a stack pointer) will be generated any more, but I'd prefer to fix these cases nonetheless. They may change in the future, and these failures are silent until the function gets complex enough to trigger them, so can be difficult to test for.
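A minimal sketch of the encoding constraint described above, using the usual ARM register numbering (the helper name is hypothetical): only R0..R7 fit a 3-bit Rn field, and SP (R13) never can.

```cpp
#include <cassert>

// Thumb "low" registers R0..R7 fit a 3-bit register field; SP is R13,
// so an instruction whose Rn field is only 3 bits wide cannot encode it.
constexpr unsigned SP = 13;

bool fitsIn3BitRnField(unsigned RegNo) {
    return RegNo < 8;  // only R0..R7 are encodable
}
```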
Mon, Sep 16
Thanks. You mentioned something about nested loops. Is that still an issue?
Will tracking liveness this late be OK? I have a memory of it not being reliable, but that might have been fixed up recently?
Yeah, it's a shame about the masked ld-st tests. They appear to be in a different order, causing the extra register usage. Seems like more of a scheduling problem. The tests in mve-pred-bitcast.ll and mve-pred-loadstore.ll are both smaller (and the masked load-store tests will eventually generate narrowing/widening masked loads/stores).
Sun, Sep 15
So we unroll the loop if the sizes are equal too? Sounds sensible to me.
Thanks. Good to know I'm not barking up the wrong tree.
Thanks. Sorry, I was on a bit of a bug hunt, and this is the opposite of that ;) It's behind an option though, so should be fine.
Fri, Sep 13
Like it. LGTM.
FYI: rL371817, in case it changes what is done here.
Thu, Sep 12
Nice. I was expecting this to happen earlier, maybe in the other branch optimisations. This seems like a good place for it, though.
Thanks! Looks good to me.
Wed, Sep 11
Tue, Sep 10
Mon, Sep 9
Nice one. LGTM.
Sorry, yes, I think this makes sense over the alternative of analyzeBranch not setting TrueBB/FalseBB. LGTM
Now with v16i1, which doesn't need the extracts/buildvector, just going through the vmrs into a vstrh store.
One question though. Loading and storing data... do we need to worry about LE and BE here?
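For reference on why the LE/BE question matters, a small host-independent demonstration (purely illustrative): serializing a 32-bit value byte-by-byte in little-endian order. Reinterpreting the same bytes at a different element size is exactly where LE and BE layouts diverge.

```cpp
#include <cassert>
#include <cstdint>

// Serialize a 32-bit value in little-endian byte order, regardless of
// the host's endianness: least significant byte goes to the lowest
// address. On a big-endian layout the byte order would be reversed.
void storeLE32(uint32_t v, uint8_t out[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = (v >> (8 * i)) & 0xFF;
}
```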
Sun, Sep 8
Change looks sensible to me, but how difficult would this be to do without the, umm, creative use of foreach?
Fri, Sep 6
LGTM. Nice one.
Very nice. Oliver went and put together a few patches for mul, add and sub, in https://reviews.llvm.org/D67268 and related. So we now (or soon will) produce the other instruction this can affect. This is good to go, I think.
Nice. Can you add tests for the %src and %sp being the other way around in the mul too? I think the same patterns should catch them, but it's worth having the tests.
Thu, Sep 5
It turns out that I do really need this to stop the old version from crashing. But I think this patch is trying to do too much, both fixing the bugs and adding VPNOTs at the same time. I've redone the fixes as D67219, and will get back to this later.