User Details
- User Since
- May 4 2016, 4:18 AM (360 w, 2 d)
Oct 28 2018
Unfortunately, the change that this test was designated to test never matured into a patch, so it was left as an orphan.
For the time being, you can even delete it altogether.
Aug 31 2018
Rebased and added lit tests
Jul 10 2018
Jul 9 2018
Jun 17 2018
Ping.
Rebasing the patch and adjusting it to the changes made in rL333689.
Removed special code for CastInst in InstructionSimplify::SimplifyWithOpReplaced as rL333689's changes handle that (in a different way).
May 28 2018
Several changes in this revision:
May 22 2018
Following this discussion, rebasing the patch to trunk.
Changing 'dyn_cast_or_null' to 'dyn_cast' in two places where the operator cannot be null.
You're right. I've originally missed the direct effect on InstSimplify, and I will change the tests location and structure if this change is accepted, but please also note that lib/Transforms/InstCombine/InstCombineSelect.cpp calls SimplifySelectInst which calls simplifySelectWithICmpCond which in turn calls SimplifyWithOpReplaced. Hence, InstCombine also benefits from this functional change.
But this example doesn't need nsw/nuw, so what is this test trying to demonstrate?
https://rise4fun.com/Alive/9zX
Exactly that. In the current status, InstCombine and InstSimplify will not perform this transformation despite it's correctness. Furthermore, if you remove the nsw/nuw flags from the 'shl' in this example, InstCombine will first identify that the 'shl' really has no signed and unsigned wraps and will add the flags before the 'select' had the chance to optimize, which will result in the same un-transformed code. This is exactly one of the missed transformation opportunities I'm trying to cease.
Also, this is still using ValueTracking, so I'm not comfortable continuing this review until we have more data about the cost and benefits.
You're right that ValueTracking is still used, but running this change on a large set of benchmark did not yield any noticeable compile-time regressions. In addition, the two functions from ValueTracking that I'm using (ComputeNumSignBits and MaskedValueIsZero) are already in use in InstructionSimplify.cpp.
May 14 2018
The title was changed. However, those changes do not affect InstSimplify directly, as they are not used by this pass. The only functional change is in InstCombine, which uses these parts of InstSimplify as a library.
After a close look I discovered some compile-time regressions, as you predicted. I removed the safety checks for 'add', 'sub' and 'mul', which require ValueTracking (so now there are only safety checks for 'shl', 'ashr' and 'lshr'). That elimintated those regressions.
Removing safety checks for 'add', 'sub' and 'mul', since they require ValueTracking which proved to be too expensive in compile time.
May 10 2018
Ping.
May 3 2018
May 2 2018
The required changes were made in D45731
May 1 2018
Apr 27 2018
What are your plans for deeper associate-operand trees, such as in the tests I've suggested at D41574? Will you support such cases?
Apr 19 2018
Restoring the ownership of this revision.
Apr 17 2018
Jan 23 2018
Ping
Ping
Ping
Jan 18 2018
Jan 10 2018
Ping
Ping
Ping
Dec 27 2017
Dec 26 2017
Now using the constant's value rather than its pointer value for enforcing total order on LeafClasses. This is needed in order to ensure deterministic output of the compiler (I actually got different results when running with different OSs before this change).
Dec 25 2017
Dec 21 2017
- Generalized InstructionSimplify's IsSafeOverflowingBinaryOperator to handle 'add', 'sub' and 'mul' in addition to 'shl'. This was done by refactoring existing behavior out of InstCombine's internals into ValueTracking, so now both InstCombine and InstructionSimplify use it.
- Added tests to cover possible cases. For each OverflowingBinaryOperator ('shl', 'add', 'sub' and 'mul') I've created tests according to the Cartesian product of four factors: the op has the nuw flag, the op has the nsw flag, there actually is an unsigned overflow, there actually is a signed overflow. The key idea is to check that optimizations occur if and only if the overflow flag is absent or no overflow is guaranteed. also note that the file name had changed from select-bitext-bitwise-ops.ll to select-obo-peo-ops.ll to better reflect the nature of the tests it contains.
Dec 19 2017
Added a comment explaining the change of pattern in the bswap idiom detection.
Dec 18 2017
Rebasing on top of the new parent review D41354
- Relaxing the canonization condition: the masked shl will be canonized to the new form only if the 'shl''s 0th operand is not a shift instruction. This is due to other, better optimization that will be able to kick in.
- Reverted the InstructionSimplify and bswap changes. Those will be added in two different reviews.
Dec 14 2017
Rebasing on parent revision and adding more test cases.
Dec 5 2017
Nov 28 2017
Could you please explain why? I'm not sure I'm seeing it.
Currently, we end up inverting the canonicalization in the x86 backend (because a smaller constant mask can be created in less instruction bytes), so it would be better to "get it right" here in IR in the first place.
I understand the multi-use case better now with your explanation, so I agree that we want this patch to handle those cases too. But I don't think we should ignore the underlying canonicalization choices just because we know we want to catch the larger patterns.
I agree that this alternative canonization could prove to be beneficial and more correct. However, I feel that this discussion is orthogonal to this patch, and if it would indeed be decided to switch to the new form then some of the code of this patch, along with several other pieces of code, may need to change accordingly.
Nov 26 2017
Nov 20 2017
Nov 13 2017
Replacing std::make_unique with llvm::make_unique.
Adding "skylake-avx512" to the architectures which the optimization will be enabled for.
Now using MCInstrDesc via MCInstrInfo instead of checking for specific opcodes.
Checked in the test and rebased to make changes more clear.
Added a negative test.
Fixed some typos.
Nov 12 2017
Adding a pattern match for a shift of and ((V&C1)<<C2).
Although and of shift is the canonical form, this new form is also required in some cases. The new test multiuse3 demonstrate such a case.
Nov 9 2017
Oct 30 2017
Oct 25 2017
Oct 24 2017
- Removed the lambda MatchShiftOfAnd as Shift of And is canonicalized to And of Shift, so the lambda MatchAndOfShift is sufficient.
- The lambda MatchAndOfShift is now not static.
- Removed unnecessary calls to replaceAllUsesWith.
- Fixed test documentation.
Oct 23 2017
Oct 16 2017
- MCAsmBackend c'tor now takes a unique_ptr of MCCodePadder instead of a raw pointer
- MCCodePaddingContext has a new field: IsBasicBlockReachableViaBranch
- Changed names of a function and several variables
Sep 19 2017
Aug 20 2017
Aug 17 2017
Aug 1 2017
Ping
Ping
Jul 12 2017
Removed unwanted dependencies introduces in my previous patch: MC layer no longer depends on CodeGen, Core or Target. This was achieved by introducing a new structure named MCCodePaddingContext that is being created by AsmPrinter and passed on to the MC layer (all the way to MCCodePadder). This replaces the direct use of TargetMachine, MachineBasicBlock, MachineFunction and MachineLoopInfo by the MCCodePadder.
Note: This time I used "svn diff -x -U999999" to create the patch, so comparison between this patch and the previous one may prove to be challenging (This had to be done for future work and review).
Removed unwanted dependencies introduces in my previous patch: MC layer no longer depends on CodeGen, Core or Target. This was achieved by introducing a new structure named MCCodePaddingContext that is being created by AsmPrinter and passed on to the MC layer (all the way to MCCodePadder). This replaces the direct use of TargetMachine, MachineBasicBlock, MachineFunction and MachineLoopInfo by the MCCodePadder.
Note: This time I used "svn diff -x -U999999" to create the patch, so comparison between this patch and the previous one may prove to be challenging (This had to be done for future work and review).
Jun 26 2017
Jun 21 2017
Should we expose this to the .s?
That would make it easy to test codegen because you would just need to
check for things like .pad_bb_begin.It would make it easier to test the layout as you could use a simple .s
file for that.Cheers,
Rafael
Jun 20 2017
May 17 2016
Added 32-bit test for the intrinsics