The comment says to avoid the case where zero bits are shifted into the truncated value, but the code checks that the shift is smaller than the truncated value instead of the number of bits added by the sign extension. Fixing this allows a shift by more than the value size to be introduced, which is undefined behavior, so the shift is capped at the value size minus one, which has the expected behavior of filling the value with the sign bit.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I couldn't tell what was happening in all these cases from the code or tests, so I added the tests with their current output here:
https://reviews.llvm.org/rL302475
I think there are 3 or more problems/opportunities here, so I want to separate these if possible. Can you make this patch just about fixing the miscompile in test91()? Is there a bug report for that case?
Note - Alive is very helpful to check that we're not either miscompiling or missing optimizations:
http://rise4fun.com/Alive/2I
All that needs to be changed to fix the test91 miscompile is changing ShiftAmt < ASize to ShiftAmt <= SExtSize - ASize. However, making this change actually introduces the test93 miscompile because the original code already assured that ShiftAmt < ASize (which is correct for avoiding a ub shift, but incorrect for avoiding pulling zero bits into the value). Note that by far the most common case of SExtSize >= 2*ASize (which is always true when all types are powers of 2) doesn't actually miscompile (because then ASize <= SExtSize - ASize anyway). I was planning on adding more optimization opportunities in a separate patch, but I'm not sure how to separate this change without introducing a miscompile. I could separate out the naming common subexpressions with variables if that helps.
Ok, your explanation makes this clearer. Since we're fixing a miscompile, I think it's ok to keep this patch mostly as-is. But please add some FIXME notes and rebase this patch after rL302475:
- This code is trying to do too much in one place IMO. It (and the related zext transform above it) should be split off into a helper function so that the (common?) case where "ASize" is the same as the final size (CI.getType()) are handled separately. When those types match, we do not need to check hasOneUse as strictly. We're also wastefully calling CastInst::CreateIntegerCast() in that case.
- We've artificially excluded vector types by using m_ConstantInt().
- As I showed in the Alive example, we can handle the case where (ShiftAmt > SExtSize - ASize) by adding an 'and' mask. This transform makes sense with appropriate one-use checking.
- We can use m_OneUse() with the match() calls to make the code cleaner.
I don't have commit access so if this is good to go could you commit it for me? Meanwhile, I'll work on actually cleaning up these two transforms.