User Details
- User Since
- Feb 21 2018, 6:39 AM (265 w, 4 h)
Thu, Mar 9
Wed, Mar 8
add check on other place and add same check for globalisel
add same checks for globalisel
add same checks for globalisel
Tue, Mar 7
Move check next to isBaseWithConstantOffset, introduce helper function that can be used in other two patches also.
Rebase
Rebase
Thu, Mar 2
Wed, Mar 1
Tue, Feb 28
Tue, Feb 21
Feb 9 2023
Feb 8 2023
Feb 6 2023
Feb 1 2023
I like the GISelFlags, looks pretty lightweight.
Are you planning to add more flags?
About IgnoreCopies, it looks correct to me. Considering the possibility to break constant bus limit, patch synergies well with GISelPredicateCode that checks for it.
Can you add look through copies to Three Op Pats from VOP3 td file?
Upcoming regbankselect changes should be able to deal with copies but nevertheless I would like to see this one merged.
LGTM but I would wait for feedback from other reviewers.
Jan 31 2023
Addressed comments.
Jan 26 2023
Jan 24 2023
Jan 20 2023
This might not be related but what happens with control flow intrinsics that get preselected in legalizer?
Dec 14 2022
Implemented D139646 instead.
Dec 9 2022
Thanks for the review.
Dec 8 2022
Another attempt in D139646, I could to get anything better with tools available in InstPrinter.
OperandType would not make much difference (maybe for printing more verbose error for /*invalid immediate*/), I skipped adding it for now since it was possible to print some meaningful msg without it.
Dec 7 2022
In most cases if instruction can't use some bits in encoding of an operand, those bits are set to default value in td file. Disassembler fails (in generic way) in this cases and does not print e.g. "instruction must use glc" (assembler prints such verbose warnings)
Dec 6 2022
Dec 5 2022
Dec 2 2022
Nov 18 2022
Update test that I missed, move to test file with other fma intrinsics.
Name change for helper function, move call inside lambda body. Copies are now created when src operand is being added to the new instruction (MIB) and inserted before it.
Nov 17 2022
Remove build copy from selectVOP3ModsImpl.
Copy is now built when adding operands to selected instruction.
Nov 16 2022
I did not understand last comment. What do I need to do exactly?
Now selectVOP3PMadMixModsImpl (unchanged) is called when we know that fma/mad_mix will be selected.
Afaik select modifiers functions don't fail and are called once pattern was matched.
The tricky thing here is that selectVOP3PMadMixModsImpl was used to check for actual pattern (it needs to find fpext of some operand).
Now we check for fpext before selecting modifiers for operands.
I was checking if test like this gets selected in the same way:
This should be equivalent to trunk llvm but without creating copy instruction when fma_mix does not get selected.
Now that I checked in tests with madmix that has sgpr input, copy is not created but it should have been.
Patch is not correct at this state.
Is there a way to move
if (!MatchedSrc0 && !MatchedSrc1 && !MatchedSrc2) return false;
in selectG_FMA_FMAD before lookup for modifiers begin via selectVOP3PMadMixModsImpl?
Split getting the operand modifiers and potential copy construction.
Nov 15 2022
Nov 14 2022
Nov 8 2022
Nov 3 2022
Correction:
Change is caused by zextload being combined pre-legalizer
Change is caused by zextload NOT being combined pre-legalizer
Nov 2 2022
To my understanding prelegalizer now also requires legalizer info?
Oct 24 2022
Oct 21 2022
Rebase
Use Value finder to search for unmerged elements. In some tests skips a few combine steps since value finder looks through artifacts.
Use Value finder to search for unmerged elements. In some tests skips a few combine steps since value finder looks through artifacts.
Use Value finder to search for unmerged elements. In some tests skips a few combine steps since value finder looks through artifacts.
Oct 7 2022
You can take over. I am not so sure this will be useful as is. Just looking through copies will break constant bus limit. Then there are constants and their register banks.
I didn't like g_maybe_cross_reg_bank_copy it would be better to have some field in PatFrag recognized by the global-isel emitter (it would generate GIM_CheckOpcodeIgnoreCopies instead of GIM_CheckOpcode for all opcode checks in pattern).
But that can break constant bus limit. There is a mechanism to collect operands with
PredicateCodeUsesOperands = 1
Maybe you could add another field in PatFrag that will be a signal to generate jump-table entries that will go through collected operands and check constant bus limit for collected operands.
Just to clarify by have some field in PatFrag recognized by the global-isel emitter I meant something like HasNoUse but sdag emitter will ignore it.
Oct 5 2022
When to expand part LGTM.
For clarity, you could also check for Subtarget->hasLDSFPAtomicAdd() together with Subtarget->hasAtomicFaddRtnInsts() and Subtarget->hasAtomicFaddNoRtnInsts() to match feature description and instructions generated during expansion (It looks to me that expand assumes that target has ds_add).
Can you re-check tests? There should be some changes in llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll, also autogenerate llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll (btw it failed for me).
Oct 3 2022
Sep 30 2022
I understand now. Could I add a look through bitcast in findValueFromDefImpl? I would expect similar result as in D117328.
We do need to start moving all of this stuff into ArtifactValueFinder though
Like this?
As mentioned before there are some regressions here. I think that it would be best to:
not lower vgpr v2s16 build_vector_trunc in regbank select, and deal with all cases in instruction select
second element is undef -> first elemenent (this patch)
default case -> bit shift packing like in regbankselect
Just to point out most notable regression:
Sep 29 2022
Bitcast can be created during legalizer. About post legalizer combiner, I don't think we run artifact combiner there, If we leave it it might block some combines combines that would become available after bitcast is combined.
Does D134354 work for non vector cases?
My best guess is that it does not work for vectors of two elements where it would need to create two instructions (missing some build vector combine/legalizer rule). You would expect that all those build vectors in between are gone after legalizer?
You don't want (<2 x s16>) but only s16 G_FMAXNUM_IEEEwhen lowering larger vectors, also you remove G_BUILD_VECTOR_TRUNC but use G_BUILD_VECTOR instead.
Can you point me to the test case that requires those changes? (You didn't change legalization rules for other <2 x s16> instructions).
To be more precise what D134354 requires from Legalizer (and pre/post legalizer combiner). Maybe there is simpler way to achieve it.
Rebase
Rebase.