During legalisation we can end up with some pretty odd nodes, like shifts of 0. We need to make sure we don't try to make long shifts of these, ending up with invalid nodes. A long shift with a zero immediate actually encodes a shift by 32.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
6013 ↗ | (On Diff #220517) | What happens if we discover the shift amount is zero after legalization? If MVE_LSLLi doesn't accept arbitrary immediates, the isel pattern should reflect that. (With only that fix, I think we still end up with an MVE_LSLLr, but that's not a correctness issue, just a missed optimization, I think.) |
Now using long_shift as an ImmLeaf.
I'm not sure how to test what will happen after legalisation. Any suggestions?
Just to verify the patch is doing what you think it is, you could hack the code.
We don't have any way to write isel tests where the input is a DAG. You could probably come up with some sequence which currently isn't folded until after type legalization, but it wouldn't really be reliable against future changes to DAG optimizations. I guess we could introduce an intrinsic that specifically turns into a constant after legalization for testing? But that's maybe overkill...
Maybe you could actually construct a test using GlobalISel? GlobalISel is at least partially implemented on ARM, but you'd need to introduce some way to express ARMlsll with GlobalISel, and I'm not sure how to do that, off the top of my head.
And as a fix for the problem here, and what I believe is a sensible fix for the long_shift pattern, does this patch look OK?
If you can't figure out how to write a test for the patterns once the optimization in Expand64BitShift is implemented, that's okay. LGTM
On a related note, why are we using a target-specific node here, as opposed to ISD::SHL_PARTS?
Thanks.
We did originally try to make this work with the SHL_PARTS nodes, and got quite far if my memory is correct. There was a certain amount of target independent code that was changed to keep them legal and prevent optimisations that we didn't want to happen from going off. My memory is fuzzy as to what the final showstopper was (if there really was one). Maybe something about treating LSRL as a LSLL with a negated operand, with the SRL_PARTS not there really being legal?