- User Since
- Jan 16 2017, 11:05 AM (127 w, 1 d)
May 13 2019
@amyk I see. Please let me know if you need any help.
Apr 9 2019
Feb 5 2019
Jan 29 2019
Jan 8 2019
Dec 29 2018
BitPermutationSelector has been enhanced to generate better code for bitfield insert. So we do not need to specially handle bitfield insert anymore.
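For readers unfamiliar with the term, the following is a minimal sketch of what a bitfield insert computes at the source level. The function name `insertBitfield` and its parameters are hypothetical and chosen for illustration only; the sketch does not show how BitPermutationSelector lowers the pattern.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: writes the low `width` bits of `field` into `base`
// starting at bit offset `pos`, leaving all other bits of `base` unchanged.
std::uint32_t insertBitfield(std::uint32_t base, std::uint32_t field,
                             unsigned pos, unsigned width) {
  // Keep every shift below 32 so the behavior stays well defined.
  assert(width >= 1 && width < 32 && pos <= 32 - width);
  std::uint32_t mask = ((std::uint32_t{1} << width) - 1) << pos;
  // Clear the destination bits, then OR in the shifted and masked field.
  return (base & ~mask) | ((field << pos) & mask);
}
```

The mask-and-or shape above is the kind of pattern a backend can try to match to a single insert-under-mask instruction instead of separate and/shift/or operations.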
Dec 28 2018
Dec 27 2018
Dec 24 2018
Thank you for finding the problem. I do not remember why I checked SeenUse here.
Dec 14 2018
Nov 6 2018
Oct 17 2018
Oct 16 2018
Oct 15 2018
@AntonBikineev When I revert this patch on my machine, the problem goes away.
I dug into the problem somewhat, although I still have not found the root cause.
- The problem occurs in the constructor of __shared_ptr called from std::_Construct. The call chain for the _Construct is std::__uninitialized_construct_buf, std::__stable_sort.
- When I disable EarlyCSE in the __shared_ptr constructor, the problem does not happen.
If a constructor is called from std::_Construct, can the this pointer alias other memory references?
Oct 12 2018
@AntonBikineev It seems that this patch is causing failures on a ppc64 buildbot http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/ . Judging from the errors, do you have any idea what is causing the problem?
Oct 10 2018
Sep 29 2018
@RKSimon Other colleagues at IBM are working on this and will hopefully submit a patch separately, so I am abandoning this.
Thank you so much for your comments.
Sep 27 2018
Sep 26 2018
Sep 25 2018
Sep 24 2018
I'm just curious, is there something that runs after this pass that will clean up unreachable blocks if any became unreachable? For example, if MBB1 is the only predecessor of MBB2 and MBB2 is removed as a successor to MBB1, will something remove MBB2 afterwards?
Actually, no later pass eliminates the unreachable blocks left by this optimization. So I added a simple unreachable block elimination and modified a test case for this.
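The idea behind a simple unreachable block elimination can be sketched on a toy control-flow graph. This is an illustration under stated assumptions, not the code from the patch: the `Cfg` type and the `reachableBlocks` function are hypothetical and do not use any LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG (hypothetical): cfg[i] lists the indices of block i's successors.
using Cfg = std::vector<std::vector<std::size_t>>;

// Returns a vector where element i is true iff block i is reachable
// from `entry` by following successor edges.
std::vector<bool> reachableBlocks(const Cfg &cfg, std::size_t entry) {
  assert(entry < cfg.size() && "entry must be a valid block index");
  std::vector<bool> reachable(cfg.size(), false);
  std::vector<std::size_t> worklist{entry};
  reachable[entry] = true;
  while (!worklist.empty()) {
    std::size_t bb = worklist.back();
    worklist.pop_back();
    for (std::size_t succ : cfg[bb]) {
      // Ignore malformed edges and blocks that were already visited.
      if (succ < cfg.size() && !reachable[succ]) {
        reachable[succ] = true;
        worklist.push_back(succ);
      }
    }
  }
  return reachable;
}
```

Blocks left unmarked after the traversal have no path from the entry and can be deleted. Marking from the entry also catches blocks that become unreachable transitively, such as the successors of a removed MBB2.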
Sep 21 2018
add the -verify-machineinstrs option to the unit tests
Sep 7 2018
@hfinkel do you have any further suggestions?
- rebased to the latest source (fix conflicts and update unit tests)
Aug 31 2018
LGTM with nitpicking.
Aug 29 2018
Does this code pattern frequently happen?
Aug 28 2018
rebased to the latest revision
This patch increases the number of record-form rotates we emit by more than 70%.
Do you mean reducing the record-form rotates by 70%? If so, it is great!
Aug 21 2018
I added a missing dependency in CMakeFiles for ARCMigrate to fix the build break on the buildbot (http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/)
- check the explicit visibility attribute before changing the visibility to default
- add more tests
Aug 20 2018
Oh, thanks! I had not found this document.
Aug 8 2018
Aug 3 2018
Aug 2 2018
Does the comment about normalization only pertain to ISA 2.07?
As far as I can tell from browsing the documents, neither ISA 2.07 nor 3.0 mentions normalization by xscpsgndp.
The P8 UM says xscpsgndp, xvcpsgndp and fmr are normalizing instructions. The P9 UM says nothing.
Since fmr is also a normalizing instruction, I feel it is acceptable to use xscpsgndp for copying registers.
Jul 31 2018
XSCPSGNDP has a longer latency (6 cycles) than XXLOR (2 cycles) on POWER8, while on POWER9 it has higher throughput at the same latency. So XXLOR is preferable for pre-P9.
Please add/adjust the tests with baseline checks as a preliminary step; we don't want to lose those in case the code change gets reverted.
I have updated baseline checks.
Jul 30 2018
- Separate the patch into two; this one is the first of the two.
- Add test cases with vector data type.
Jul 28 2018
Yes, I'd also like to see the tests get committed now with baseline CHECKs.
I have committed unit tests in https://reviews.llvm.org/rL338107
Jul 27 2018
Jul 26 2018
addressed the comments from @lebedev.ri
Jul 23 2018
- fix a bug with an integer larger than 64 bit
- add more test cases
- remove an unnecessary check
Jul 19 2018
So i guess my question is, what in instcombine does this fold?
Is it very general, and this is only one of the cases it handles?
If not, maybe it should be refactored into instsimplify.
For the shl->or->lshr case, instcombine first identifies that the or is redundant and eliminates it. Then the shl/lshr pair is eliminated. I think it is less general than my code.
For the shl->or->and case, instcombine performs a more general but more costly analysis in SimplifyDemandedInstructionBits. Doing the same analysis seems too costly, but I expanded the scope of my code to cover non-boolean cases.
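The shl->or->lshr case can be illustrated with a small sketch. The shift amount of 8, the 32-bit width, and the function names `shlOrLshr` and `folded` are assumptions chosen for illustration; they are not taken from the patch or from instcombine.

```cpp
#include <cassert>
#include <cstdint>

// Original pattern: shl -> or -> lshr with the same shift amount (8 here).
std::uint32_t shlOrLshr(std::uint32_t x, std::uint32_t c) {
  // The or is redundant only if c fits in the bits that lshr discards.
  assert(c < 256 && "c must fit in the low 8 bits");
  return ((x << 8) | c) >> 8;
}

// After the or is dropped, the shl/lshr pair reduces to masking off the
// top 8 bits of x, which the left shift had pushed out.
std::uint32_t folded(std::uint32_t x) { return x & 0x00FFFFFFu; }
```

Because the or only touches bits that the following lshr shifts out, it never affects the result. That is the sense in which the or is redundant in this pattern.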
Jul 17 2018
update comments and test cases
Jul 16 2018
- add more test cases
- make the algorithm more general using m_c_Or instead of m_Or
Jul 9 2018
- rebase to the latest
- make the test cases more strict