- User Since
- Feb 21 2018, 6:39 AM (174 w, 1 d)
Thu, Jun 17
Merge with G_FCONSTANT case above.
Fix cases of SNaN and QNaN constant splats padded with undef, also add test for them.
Add one test for splat padded with undef with ieee=true and dx10_clamp=true.
Move to isCanonicalized and post-legalizer combiner.
Wed, Jun 16
This is required to match a splat that was padded with undef into clamp (D90052). When undef pads an element that does not affect the result, we interpret it in whatever way lets us clamp the other elements.
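The idea of treating undef padding as "whatever makes the splat match" can be sketched as follows. This is a toy model, not the AMDGPU combiner code: `UNDEF` and `splat_value_ignoring_undef` are hypothetical names, and elements are plain Python floats rather than G_BUILD_VECTOR operands.

```python
from typing import Optional, Sequence

# Stand-in for an undef build_vector element (assumption of this sketch).
UNDEF = None


def splat_value_ignoring_undef(elements: Sequence[Optional[float]]) -> Optional[float]:
    """Return the common constant if all defined elements agree, else None.

    Undef elements are skipped: they may take any value, so they are
    read as equal to the splat constant.
    """
    defined = [e for e in elements if e is not UNDEF]
    if not defined:
        return None  # all-undef vector: there is no constant to clamp against
    first = defined[0]
    return first if all(e == first for e in defined) else None
```

For example, `splat_value_ignoring_undef([1.0, UNDEF])` yields `1.0`, while a non-splat such as `[0.0, 1.0]` yields `None`.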
May 10 2021
May 7 2021
Can you also add an end-to-end IR test for this case?
No longer possible because of D101962 (it was also far too large for a lit test).
I did find another test; it needs -mattr=+unaligned-access-mode and align 1 to pass through the legalizer and get a vgpr dst + sgpr address.
Test split for precommit, found another test for end to end run.
May 6 2021
This was detected on gfx7 in a large test (a few hundred lines) with many uniform loads.
Loads in the first ~200 lines had amdgpu.noclobber, but past some threshold they did not.
AnnotateUniformValues was relying on MemoryDependenceResults::getSimplePointerDependencyFrom, which gives up at some point and returns MemDepResult::getUnknown(), so amdgpu.noclobber is not set on the address. Such a load is then selected with a vgpr destination (and an sgpr address on gfx7).
For the test in question, the missing amdgpu.noclobber is fixed by D101962, so only a MIR test is added here.
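The "gives up at some point" behaviour can be illustrated with a bounded backwards scan. This is a minimal sketch and does not reproduce the real MemoryDependenceResults logic: `find_clobber`, `scan_limit`, and the `(kind, pointer)` instruction tuples are hypothetical.

```python
from typing import List, Tuple


def find_clobber(instrs: List[Tuple[str, str]], load_index: int, scan_limit: int) -> str:
    """Walk backwards from a load and report 'clobber', 'none' or 'unknown'.

    Each instruction is a (kind, pointer) pair. Once scan_limit earlier
    instructions have been inspected without an answer, the scan gives up
    and returns 'unknown', which a caller has to treat conservatively.
    """
    addr = instrs[load_index][1]
    scanned = 0
    for kind, ptr in reversed(instrs[:load_index]):
        if scanned == scan_limit:
            return "unknown"  # budget exhausted before reaching the top
        scanned += 1
        if kind == "store" and ptr == addr:
            return "clobber"
    return "none"
```

With a small budget, a load far down a long block gets `"unknown"` even though nothing clobbers it, which mirrors loads past the threshold losing the annotation.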
Apr 29 2021
Apr 28 2021
Apr 27 2021
Apr 26 2021
Add unit test.
Removed matchers that don't check opcode.
My guess is that this was originally done as an equivalent of IR/PatternMatch.h.
Thus I also added AnyBinaryOp_match alongside the version that checks the opcode, which is what I originally wanted (named BinaryOpc_match here).
I thought that AnyBinaryOp_match would be useful when we already know the opcode, so we avoid checking it again, but it turns out that at the moment this is not that useful.
I would prefer to leave the existing code with opcodes as template arguments, as in IR/PatternMatch.h, and add a new matcher that takes the opcode as a regular argument instead. m_Add etc. are meant to have a templated opcode; changing it to a regular opcode argument almost certainly produces the same result but brings no improvement.
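The difference between an opcode-checking matcher and an opcode-agnostic one can be sketched with a toy instruction type. This is not the MIPatternMatch API: `Instr`, `match_binary_opc`, and `match_any_binary_op` are hypothetical, and Python cannot express the template-versus-runtime distinction, so only the runtime-opcode form is modelled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Instr:
    opcode: str
    defs: Tuple[str, ...]
    uses: Tuple[str, ...]


def match_any_binary_op(instr: Instr) -> Optional[Tuple[str, ...]]:
    """Match any instruction with exactly one def and two uses."""
    if len(instr.defs) != 1 or len(instr.uses) != 2:
        return None
    return instr.uses


def match_binary_opc(instr: Instr, opcode: str) -> Optional[Tuple[str, ...]]:
    """Like match_any_binary_op, but also require a specific opcode,
    passed as an ordinary argument."""
    if instr.opcode != opcode:
        return None
    return match_any_binary_op(instr)
```

Checking the def/use counts is what makes a non-binary instruction fail to match regardless of what the operand sub-matchers would accept.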
Addressed review comments.
Rebase and ping.
Simplify the tests that fail because of a non-binary instruction (match a register instead of m_ICst), since they already fail when the number of def/use operands is checked. Update commit message.
Apr 8 2021
Added tests for instructions with 0 and 2 defs.
Apr 2 2021
Apr 1 2021
Fixed G_FCONSTANT and added test for it.
Extracted MIPatternMatch changes.
Mar 31 2021
Added basic version of isCanonicalized for global-isel. Copied from sdag.
Rebased. Updated uniform test (Jay fixed reg-bank-select for uniform min and max) and added test for v2i16 which we don't combine.
Renamed G_AMDGPU_MED3 to G_AMDGPU_SMED3 since this patch also adds UMED3.
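For context, a med3 operation returns the median of its three operands, which is why a min/max clamp can be expressed with it. The sketch below assumes the combine targets the shape min(max(x, K0), K1) with K0 <= K1; the helper names are hypothetical, and Python integers stand in for registers (the signed/unsigned split between SMED3 and UMED3 is about how the operands are compared).

```python
def med3(a: int, b: int, c: int) -> int:
    """Median of three values."""
    return sorted((a, b, c))[1]


def clamp_via_min_max(x: int, lo: int, hi: int) -> int:
    """Clamp x to [lo, hi] with a max followed by a min (requires lo <= hi)."""
    return min(max(x, lo), hi)
```

Whenever `lo <= hi`, `clamp_via_min_max(x, lo, hi)` and `med3(x, lo, hi)` agree for every `x`.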
Mar 29 2021
Mar 24 2021
Mar 12 2021
Looks good, I would like for other reviewers to also take a look.
Mar 5 2021
Mar 4 2021
Mar 3 2021
Removed isVector checks for non-Vector opcodes. Use Register instead of unsigned.
Mar 2 2021
Add a few unit tests and break for vector types on some opcodes.
Feb 23 2021
There is no need for helper state class.
Dropping the icmp move from this patch. Leaving zext_trunc_fold.
Zext is selected into AND with 1. zext_trunc_fold results in getting rid of the SCC copies when zext was the only instruction between icmp and select/branch.
Feb 18 2021
Feb 12 2021
Feb 8 2021
Use zext_trunc_fold from generic combiner to separately fold all cases of zext(trunc x) -> x made by regbankselect.
The icmp move before select/brcond has to be aware of the current state of the MF, since we run combines top-down and instructions (trunc) can be left without uses (the zext was deleted by zext_trunc_fold).
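How the fold can leave a trunc without uses is sketched below on a toy instruction list. This is not the generic combiner: `Inst`, `fold_zext_trunc`, and the `high_bits_zero` set are hypothetical, each instruction has a single source, and the sketch assumes the zext result has the same type as x and that the caller already knows the truncated bits of x are zero.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Inst:
    dst: str
    op: str
    src: str


def fold_zext_trunc(insts: List[Inst], high_bits_zero: Set[str]) -> List[Inst]:
    """Fold zext(trunc x) -> x top-down, then drop truncs left without uses."""
    defs: Dict[str, Inst] = {i.dst: i for i in insts}
    replaced: Dict[str, str] = {}  # folded zext result -> x
    kept: List[Inst] = []
    for inst in insts:  # top-down, like the combiner walk described above
        src = replaced.get(inst.src, inst.src)
        inst = Inst(inst.dst, inst.op, src)
        d = defs.get(src)
        if inst.op == "zext" and d is not None and d.op == "trunc" and d.src in high_bits_zero:
            replaced[inst.dst] = d.src  # later users read x directly
            continue
        kept.append(inst)
    used = {i.src for i in kept}
    # The trunc is only visibly dead after the zext is gone.
    return [i for i in kept if not (i.op == "trunc" and i.dst not in used)]
```

On `icmp -> trunc -> zext -> select`, the select ends up reading the icmp result directly and both the zext and the trunc disappear; a combine that runs between the fold and the cleanup would still see the dead trunc.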
Feb 5 2021
Adding non-splat test.
isKnownToBeAPowerOfTwo ends up checking known bits for build_vector and fails for non-splat values.
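One way to see why a non-splat build_vector defeats a known-bits check: the known bits of the vector are only those on which every element agrees. The sketch below is a simplified model under that assumption, not the GlobalISel known-bits implementation; both function names are hypothetical and the element list must be non-empty.

```python
from typing import Sequence, Tuple


def known_bits_of_build_vector(elements: Sequence[int], width: int = 32) -> Tuple[int, int]:
    """Return (known_zero, known_one): bits that are 0 / 1 in every element."""
    mask = (1 << width) - 1
    known_zero = mask
    known_one = mask
    for e in elements:
        known_one &= e & mask
        known_zero &= ~e & mask
    return known_zero, known_one


def is_known_power_of_two(known_zero: int, known_one: int, width: int = 32) -> bool:
    """True only if exactly one bit is known one and all others are known zero."""
    mask = (1 << width) - 1
    return bin(known_one).count("1") == 1 and (known_zero | known_one) == mask
```

For the splat `[4, 4]` every bit is known and the check succeeds. For `[2, 4]` each element is a power of two, but bits 1 and 2 differ between elements, so they are unknown and the check fails.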
Feb 4 2021
Feb 2 2021
Handle some cases with many uses. Add an icmp fold without the move for the case where the icmp cannot be moved, because the code looks nicer when there is more than one use.
Feb 1 2021
Addressed review comments.