Aug 10 2016
Aug 5 2016
Ping. (Not sure who else would be appropriate to review this.)
Jul 31 2016
Might be nice to have a test which triggers this combine without going through legalization. (Should be possible, I think?)
Jul 30 2016
Overall, this is looking much better; simple is good. :)
Jul 29 2016
Oh, I wasn't really paying attention to that... I guess it is? If it doesn't work, that's fine. :) The tests look better now.
Yes, create a function for it. And in general, assume the answer to "should I create a function for this" is yes.
Oh, and yes, this version is what I was thinking of.
It would be nice to have a couple testcases involving numbers which aren't zero.
Jul 28 2016
I think you missed the point of my suggestion... as long as you can prove both sides have enough sign bits, it doesn't matter what kind of node you're dealing with.
You might want to consider special-casing division by a constant integer. For example, on x86, we can convert a 16-bit unsigned divide by a constant into a pmulhuw+psrlw.
Jul 27 2016
There may be another road if you think that would be better: we could keep the current loop rotation as is, and add a new loop rotation pass in a different file, and then deprecate the existing pass once everybody is convinced that the new code is better. Would that work for you?
LGTM... with the caveat that I'm not sure I completely understand how the AVX-512 boolean registers work.
Jul 26 2016
I'll try to outline a scenario where this misbehaves...
I don't see how unnamed_addr is relevant here. It means that the code won't compare the address of the global in question to the address of a different global. It doesn't have anything to do with comparisons involving different addresses within the same global.
It's similar to TBAA, I guess... TBAA tends to cause problems too; fno-strict-aliasing is popular for a reason. :)
(I'll refer to the new attribute as inrange to try to avoid confusion.)
Jul 25 2016
Jul 24 2016
Don't use PointerMayBeCapturedBefore. Add some comments.
This is taking so many iterations to get to a reasonable place that I think you need some other approach to testing; you clearly aren't finding the interesting cases. Have you considered some sort of test generator, which randomly generates known one and known zero masks, then exhaustively checks whether the end result is consistent?
Jul 23 2016
Jul 22 2016
InstrOrdering is basically an OrderedBasicBlock, I guess... I think I'd need to modify the interface a bit for this exact use to deal with modifications.
Jul 21 2016
Rebase on top of trunk.
You're missing an allOperandsAvailable call if Val is a GEP. (I'm not sure why you're even trying to special-case GEPs here.)
Re: the side-effect issue: You're trying to prove the loop will eventually either exit or execute undefined behavior. loopHasNoAbnormalExits proves that if the loop exits, it will exit via a visible edge (so it can't throw an exception or kill the program). There's a gap between the two: loopHasNoAbnormalExits doesn't care whether there's a nested infinite loop. In practice, this becomes an issue with atomic operations.
Needs more comments in the code, and more tests. In particular, you have zero tests of values with trailing zeros.
LGTM for now, I guess.
There, SCCP actually does add power to the transformation; consider a construct like the following:
The transformation valid... but I don't think it makes sense to perform it here. SCCP doesn't add any power to the transformation over instsimplify.
It looks like you're adding a bunch of bugpoint-reduced testcases without any comments... tests should at least explain what specifically they're trying to test.
Jul 20 2016
Sorry about the delay. LGTM.
Jul 19 2016
It looks like there are other places that look for bit-tests which are affected... see https://github.com/llvm-mirror/llvm/blob/dba9c3285e87768e80da6ca60f73b8913392dd98/lib/Transforms/InstCombine/InstCombineSelect.cpp#L579 , https://github.com/llvm-mirror/llvm/blob/65165b21da2aaed908f7bde1b8a16e7a8fe0c955/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp#L597 . Maybe we want a separate utility function to condense this functionality, so we don't end up with so many versions of it?
The x86-specific operation is affected by the rounding mode... but so is a C cast. This is specified by Annex F in the C standard.
I don't think we need to use x86-specific operations for sitofp-like conversions; the C cast is equivalent given that a 32 or 64-bit integer is always in within the range of a 32-bit float.
Jul 18 2016
Jul 14 2016
Oh, right, missed the ControlsExit. That doesn't help with the exception-throwing case, though.
You have to be a bit more careful here... consider:
Jul 13 2016
The semantics you've implemented seem extremely dangerous. You're making some very strong assumptions about how exactly the optimizer manipulates GEPs. I'm not sure whether the optimizer actually breaks this particular construct, but it easily could. The optimizer can and will dig through a ConstantExpr and transform the use into something different.
Jul 12 2016
The question isn't really what instructions we expect targets to have, but rather which version makes LLVM simpler overall. That said, if we assume almost every vector target has some sort of blend instruction, and therefore almost every target would need to pattern match the shuffle into a select, that would be a reason to prefer the select form, I guess.
I would lean towards making shufflevector canonical. It interacts better with existing transforms involving shuffles and vector insert/extract. I can't think of any benefit to making the select form canonical.
Is "zext(a_bool ^ true) ^ 2" canonical, or is "zext(a_bool) ^ 3"? Either way, you're missing a transform.
You probably want to refactor this if you're planning to use it from IPSCCP... but you can do that in the followup, I guess.
Jul 9 2016
Sure, that's fine (in a separate commit, of course).
I think you're misunderstanding what "isUndefined" means... it means "we haven't computed the value of this instruction yet", not "this value is an UndefValue". See https://www.cs.rice.edu/~keith/512/2011/Lectures/L19-SCCP-1up.pdf and http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.105.4146&rep=rep1&type=pdf .
Jul 8 2016
SCCP doesn't handle vectors. There's some code for it but it's #if 0'd . I re-enabled that code and tried to build and it passes test suite and it's able to self-host LLVM, so, I'm not entirely sure why that code is disabled and the comments in the code don't help :| Do you happen to know what's the reason?
Jul 7 2016
I guess it's a good idea to try to fold constants as we build them... it's slightly more efficient, and it could avoid spurious "overdefined" markings in some cases. Might as well make a pass over the whole file while you're at it to find other places where we do something similar.
SCCPSolver::visitCastInst already handles bitcasts... what are you trying to do?
Do we care at all about the number of uses of the operands?
Jul 6 2016
Please don't copy-paste code.
Are you sure you uploaded the right patch here? This isn't against trunk, as far as I can tell.
The code looks fine. The comments could use a bit of refinement.
Please ignore my previous comments, they were overly terse and not quite correct.