Fri, Mar 24
(1) Do not combine when the store node generates additional value(s).
(2) Add getCombineStoreAndExtractIdx() to query the index directly, so there is no need to loop through all elements.
(3) If the constant value to be stored is referenced by a node other than the current Store and the vector node, do not combine.
(4) Do not combine if the constant value is -1U.
(5) Add a check to make sure the to-be-extracted vector type is legal.
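The bail-out conditions listed above can be pictured as a single predicate. The following is only a minimal sketch under stated assumptions: `StoreCandidate`, its fields, and `mayCombineStoreAndExtract()` are hypothetical names invented for illustration, they are not part of the patch or of LLVM, and `-1U` is assumed to be a 32-bit all-ones value. Item (2) is an API addition rather than a legality check, so it is not modeled here.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the facts the combine would inspect.
// None of these names come from the patch or from LLVM's SelectionDAG API.
struct StoreCandidate {
  unsigned NumResultValues;   // values produced by the store node
  unsigned ConstantUseCount;  // nodes that reference the stored constant
  uint32_t ConstantValue;     // the constant being stored (assumed 32-bit)
  bool VectorTypeIsLegal;     // is the to-be-extracted vector type legal?
};

// Returns true only if none of the bail-out conditions (1), (3), (4), (5) apply.
bool mayCombineStoreAndExtract(const StoreCandidate &C) {
  // (1) The store produces something beyond its single expected result.
  if (C.NumResultValues > 1)
    return false;
  // (3) The constant has a user other than the store and the vector node.
  if (C.ConstantUseCount > 2)
    return false;
  // (4) The constant is -1U (all ones).
  if (C.ConstantValue == static_cast<uint32_t>(-1))
    return false;
  // (5) The vector type to extract from is not legal.
  if (!C.VectorTypeIsLegal)
    return false;
  return true;
}
```

Ordering the checks from cheapest to most specific keeps each early return tied to exactly one numbered item, which makes it easy to match a failing test case back to the rule that rejected it.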
Wed, Mar 22
Saw an error in bootstrap.
Some test cases to sanity-check that constants are stored as expected:
Tue, Mar 21
Mon, Mar 20
Thu, Mar 16
As the verifier change and baseline test cases have been moved into https://reviews.llvm.org/D145767, update this patch.
Add an empty case in Verifier/associated-metadata-aix-xcoff.ll
Tue, Mar 14
(1) Duplicated the associated test cases to make copies for AIX XCOFF, and added multiple-operand and null-operand cases as legal cases.
(2) Updated the verifier logic to check for the AIX target.
(3) Moved the XFAIL check to specific lines.
Mon, Mar 13
(1) Add a test case to show how associated metadata will be used on AIX (currently not supported, marked with XFAIL).
(2) Add LangRef changes to explain.
I saw the following case fail due to this commit with IPSCCPPass opt -passes="ipsccp"
Fri, Mar 10
Thu, Mar 9
Rebase and update the patch.
(1) Update the verifier check on associated metadata to allow multiple operands for AIX.
(2) Update test cases to use opaque pointers.
Currently this patch does not work due to the limit set on the associated metadata operand count (forced to be a single operand).
commit 87f2e9448e82bbed4ac59bb61bea03256aa5f4de
Author: Matt Arsenault <Matthew.Arsenault@amd.com>
Date: Mon Jan 9 12:17:38 2023 -0500
Wed, Mar 8
Attempt to push the logic into DAGCombiner::mergeConsecutiveStores()...
Thu, Mar 2
Update the StoreSizeInBits check to skip power-of-2 bit sizes less than 8.
(1) Update the element type ElemTy, which now matches the type expected by both the STFIWX and STXSIX PPCISD nodes.
(2) Add the missing match pattern for PPCstxsix.
Wed, Mar 1
(1) Format code to follow the coding style guidance.
(2) Fix SplatValue check.
(3) Remaining redundant instructions like mtfprd will be fixed in separate patches.
Tue, Feb 28
Feb 28 2023
Plan to continue improving the patch...
Feb 27 2023
Update according to comments:
(1) Removed the redundant IsByValArg check.
(2) Removed one redundant test case scenario.
Feb 21 2023
Feb 19 2023
Continuing the nit: maybe removing the first else makes more sense...
Addressed the comment and removed the second else.
Feb 16 2023
Redo the implementation, and now both memset and constant splat array initialization get changed.
Add one more case.
Feb 7 2023
I will try to get similar results with a DAG combine. Thanks to Nemanja and Kai for their insight!
Feb 6 2023
Feb 5 2023
Feb 2 2023
Jan 30 2023
Rebase && Gentle ping.
Jan 16 2023
Jan 13 2023
(1) Use new API to do the check.
(2) Remove assert requirement.
(3) Fixed test case: CHECK-PCREL should not be tailcall.
Jan 12 2023
As suggested by Zheng, I'm looking for a solution that reuses existing functions in PPCISelLowering.
Jan 8 2023
Plan to enable 256-bit vector ld/st codegen, and then add the motivating case.
Jan 5 2023
Refresh and Ping.
Jan 3 2023
Update the patch, as the following test case pattern changed:
Dec 26 2022
Dec 19 2022
Dec 14 2022
Dec 13 2022
Dec 12 2022
Dec 8 2022
Dec 7 2022
Dec 6 2022
Dec 2 2022
Test case update.
Add memset2TailV1Bx() cases which do not fit in vspltisb and may require a constant pool.
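For context on "does not fit in vspltisb": the instruction splats a 5-bit signed immediate, so only byte values in the range -16..15 can be materialized with it directly. A minimal sketch of that range check follows; `fitsInVspltisb()` is a hypothetical helper written for illustration, not a function from the patch or from the PowerPC backend, and it does not model other ways the backend may build a splat constant.

```cpp
#include <cstdint>

// Hypothetical helper: can this memset fill byte be splatted with a single
// vspltisb? The instruction's immediate is a 5-bit signed field, so the byte,
// reinterpreted as signed, must lie in [-16, 15].
bool fitsInVspltisb(uint8_t FillByte) {
  int Signed = static_cast<int8_t>(FillByte);
  return Signed >= -16 && Signed <= 15;
}
```

Under this check a fill byte such as 0xFF (that is, -1) fits, while 0x10 (16) or 0x80 (-128) does not, which is the kind of value that may instead need a constant pool load.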
Dec 1 2022
Nov 30 2022
Add memset-tail.ll changes.
Nov 29 2022
Changes in this update:
(1) I tried to use TTI.getVectorInstrCost() to query the instruction cost in PPCTargetLowering::canCombineStoreAndExtract(). However, I was not able to reach TTI and did not find any precedent for doing that in SDAG. Given that the original implementation of canCombineStoreAndExtract() on ARM uses its own logic to calculate Cost, I followed that approach and implemented the logic by referring to PPCTTIImpl::getVectorInstrCost().
Realized I need to use the PPCTTIImpl::getVectorInstrCost() API to determine the cost of instructions. I'm working on it now.
Update according to comments:
(1) Use the existing canCombineStoreAndExtract() instead of creating a new one.
(2) Nest the else statement properly.
(3) Saw two cases changed due to (1).
Add more test cases according to comment.