Page MenuHomePhabricator

fzhinkin (Filipp Zhinkin)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 18 2021, 3:11 PM (14 w, 4 d)

Recent Activity

Mon, Nov 22

fzhinkin added inline comments to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Mon, Nov 22, 10:11 AM · Restricted Project

Wed, Nov 17

fzhinkin added inline comments to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Wed, Nov 17, 12:32 PM · Restricted Project

Mon, Nov 8

fzhinkin added inline comments to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Mon, Nov 8, 8:47 AM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Rebase

Mon, Nov 8, 8:28 AM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Rebase

Mon, Nov 8, 7:10 AM · Restricted Project
fzhinkin added inline comments to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Mon, Nov 8, 6:55 AM · Restricted Project
fzhinkin added inline comments to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Mon, Nov 8, 1:06 AM · Restricted Project

Wed, Nov 3

fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Simplified expression in assertion.

Wed, Nov 3, 12:28 PM · Restricted Project
fzhinkin added inline comments to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Wed, Nov 3, 7:31 AM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Cleanup

Wed, Nov 3, 7:28 AM · Restricted Project
fzhinkin added inline comments to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Wed, Nov 3, 4:26 AM · Restricted Project
fzhinkin added a comment to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

X86 changes look good. Are AArch64 changes actually a regressions, or are they just changes?

Thanks!
There is no regression for AArch64: better code sequence is generated for legalized i128 shifts, but there is no improvement for wider types (like i256).

A change that only triggers for some (common?) cases and causes no regressions is a better step forward
than a change that indiscriminately triggers on everything and causes widespread regressions i would think :)

Wed, Nov 3, 12:47 AM · Restricted Project

Tue, Nov 2

fzhinkin added a comment to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

X86 changes look good. Are AArch64 changes actually a regressions, or are they just changes?

Tue, Nov 2, 2:07 PM · Restricted Project
fzhinkin added a comment to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Instead I supported funnel shifts TargetLowering::optimizeSetCCOfExpandedShift (did not support rotations and bit/byte swaps because such nodes should not be created during expanded shift's combining).

While optimization works fine for i686 now there is an issue with AArch64: shifts expanded from types wider than i128 won't be optimized (see @opt_setcc_shl_ne_zero_i256) because for AArch64 funnel shift alike patterns combined into AArch64ISD::EXTR instead of FSHL/FSHR. I attempted to fix it by implementing (2), but the solution was fragile and didn't work in some cases.

I'm hoping D112443 will help with this

D112443 has been committed - please can you see if it helps?

Tue, Nov 2, 1:31 PM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Rebase

Tue, Nov 2, 1:28 PM · Restricted Project

Mon, Nov 1

fzhinkin added a comment to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Handling of trees generated during legalization of i128/i256/etc to i32 is relatively simple, but in case of i686 some of these expressions are folded into funnel shifts before SimplifySetCC is applied to setcc.
I see two options here (but I'm far from being an expert, so correct me if I'm wrong and there are simpler alternatives):

  1. support various instructions (funnel shifts, rotations, bit/byte swaps) in TargetLowering::optimizeSetCCOfExpandedShift in addition to SHL/SRL;
  2. support only SHL/SRL in TargetLowering::optimizeSetCCOfExpandedShift and apply it in DAGTypeLegalizer::IntegerExpandSetCCOperands right after setcc's operands expansion.

Personally I'm leaning towards second option as it should be less fragile and easier to maintain.

Makes sense to try (2) first - although I expect at least partial support for (1) might end up being required - you are handling a pattern that is almost a funnel shift much of the time.

Mon, Nov 1, 12:26 PM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Reimplemented expanded shift matching to handle funnel shifts.

Mon, Nov 1, 12:12 PM · Restricted Project

Oct 25 2021

fzhinkin added a comment to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

please can you pre-commit these new tests to trunk with current codegen and then rebase to show the diff?

I don't have a permission to commit changes, so I'll appreciate if you can help me with it. Here's the patch adding new tests with checks generated for current trunk:

Done (sorry for the delay) - please can you rebase?

Thank you! Rebase done.

Also, thanks for adding i686 to X86's test, it revealed that my optimization does not work when we're legalizing i128 to i32. I'll check if that case could be easily supported.

Oct 25 2021, 1:59 AM · Restricted Project

Oct 22 2021

fzhinkin added a comment to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

please can you pre-commit these new tests to trunk with current codegen and then rebase to show the diff?

I don't have a permission to commit changes, so I'll appreciate if you can help me with it. Here's the patch adding new tests with checks generated for current trunk:

Done (sorry for the delay) - please can you rebase?

Oct 22 2021, 8:17 AM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Rebased to current trunk

Oct 22 2021, 8:12 AM · Restricted Project

Oct 20 2021

fzhinkin added a comment to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

please can you pre-commit these new tests to trunk with current codegen and then rebase to show the diff?

Oct 20 2021, 1:38 PM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Fixed typos in ARM tests.

Oct 20 2021, 1:38 PM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Renamed test files, added nounwind attribute.

Oct 20 2021, 12:09 PM · Restricted Project
fzhinkin added a comment to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Ping

Oct 20 2021, 12:52 AM · Restricted Project

Oct 11 2021

fzhinkin added inline comments to D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Oct 11 2021, 2:18 PM · Restricted Project
fzhinkin updated the diff for D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.

Fixed typos, cleaned up the code.

Oct 11 2021, 2:17 PM · Restricted Project
fzhinkin retitled D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0 from [TargetLowering] Optimize expanded SRL/SHL feeded into SETCC ne/eq 0 to [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Oct 11 2021, 3:15 AM · Restricted Project
fzhinkin requested review of D111530: [TargetLowering] Optimize expanded SRL/SHL fed into SETCC ne/eq 0.
Oct 11 2021, 3:12 AM · Restricted Project

Sep 12 2021

fzhinkin added inline comments to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Sep 12 2021, 2:48 AM · Restricted Project
fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Fixed the test w/ icmp ne by preventing icmp's transformation.

Sep 12 2021, 2:37 AM · Restricted Project

Sep 11 2021

fzhinkin added inline comments to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Sep 11 2021, 1:53 AM · Restricted Project
fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Updated comments, fixed few bugs.

Sep 11 2021, 1:52 AM · Restricted Project

Sep 10 2021

fzhinkin added inline comments to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Sep 10 2021, 11:42 AM · Restricted Project
fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Rebased changes.

Sep 10 2021, 11:42 AM · Restricted Project

Sep 9 2021

fzhinkin added inline comments to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Sep 9 2021, 1:15 PM · Restricted Project
fzhinkin added inline comments to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Sep 9 2021, 11:48 AM · Restricted Project
fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Added comments to both transformation and tests, made multi-user test more readable.

Sep 9 2021, 11:44 AM · Restricted Project

Sep 5 2021

fzhinkin added a comment to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

ping

Sep 5 2021, 10:42 AM · Restricted Project

Aug 28 2021

fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Cleaned up the code, updated tests.

Aug 28 2021, 6:26 AM · Restricted Project
fzhinkin added inline comments to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Aug 28 2021, 4:05 AM · Restricted Project

Aug 25 2021

fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Replace original mul with mul using frozen input.

Aug 25 2021, 2:17 PM · Restricted Project
fzhinkin updated the summary of D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Aug 25 2021, 11:45 AM · Restricted Project
fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Supproted undef values in both condition's and select's arguments.

Aug 25 2021, 11:45 AM · Restricted Project

Aug 24 2021

fzhinkin added a comment to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
  1. what's wrong with vectors?

There's nothing wrong, I'll support it, thanks!

  1. can't we preserve no-wrap flags?

Definitely.

  1. for vectors, what if some element is not zero but undef? (see Constant::mergeUndefsWith())

Not sure that I fully understand the issue.
If %a in the snippet below contains some undefs then the select may choose either 0 or %m depending on particular value of a (which could be 0 for undef elements).
Probably %a should be frozen to ensure that both icmp and mul will see the same a's value, but (if I understood it correctly) absence of freeze does not compromise the correctness.

define <2 x i4> @src(<2 x i4> %a, <2 x i4> %b) {
  %c = icmp eq <2 x i4> %a, zeroinitializer
  %m = mul <2 x i4> %a, %b
  %r = select <2 x i1> %c, <2 x i4> zeroinitializer, <2 x i4> %m
  ret <2 x i4> %r
}

Or you meant that the code could be folded even if icmp's RHS is not only the all-zeros vector, but a constant-vector containing some undefs?

I mean that if in either zero constant vector, a particular element is undef, then in the other constant vector, the same element can be anything:
https://alive2.llvm.org/ce/z/-gt253
https://alive2.llvm.org/ce/z/3PAU3E

So i think you want if(!match(Constant::mergeUndefsWith(TrueVal, CondVal.getOperand(1)), m_Zero)) return;

Aug 24 2021, 12:12 PM · Restricted Project
fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Enabled folding for vectors, copied flags for mul, renamed tests.

Aug 24 2021, 10:44 AM · Restricted Project

Aug 21 2021

fzhinkin added a comment to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
  1. what's wrong with vectors?
Aug 21 2021, 3:17 AM · Restricted Project

Aug 20 2021

fzhinkin retitled D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y). from [InstSimplify] Transform X == 0 ? 0 : X * Y --> X * freeze(Y). to [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Aug 20 2021, 2:51 PM · Restricted Project
fzhinkin added a comment to D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

InstSimplify can not create new instructions
This needs to go into instcombine
While there, please let's create the multiply, it's less ugly.

Aug 20 2021, 2:39 PM · Restricted Project
fzhinkin updated the diff for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..

Moved fold optimization to instcombine, updated tests.

Aug 20 2021, 2:38 PM · Restricted Project

Aug 19 2021

fzhinkin added reviewers for D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y).: RKSimon, majnemer, nikic.
Aug 19 2021, 2:13 PM · Restricted Project
fzhinkin requested review of D108408: [InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y)..
Aug 19 2021, 2:08 PM · Restricted Project