User Details
- User Since
- May 16 2016, 11:34 AM (384 w, 2 d)
Aug 14 2023
Jul 26 2023
Jul 24 2023
Jul 21 2023
Oops, C++ fail, and I missed some tests.
Jul 20 2023
Jan 5 2023
Dec 14 2022
Dec 12 2022
Dec 7 2022
There is an existing place in codegen that handles forwarding a stored value to the corresponding load: ForwardStoreValueToDirectLoad in lib/CodeGen/SelectionDAG/DAGCombiner.cpp. Handling this in codegen would avoid putting the code in multiple places. The code there is relatively simple: it just checks whether the node immediately before the load on the chain is the store that sets the value. For the first case in byval-lhs.ll there is a CALLSEQ_START on the chain between the store and the load. It may be possible to look past that in the chain to find the store. Since the load is for a call, a register copy may be required to replace the load. Where a sequence of multiple stores is followed by multiple loads, it would require looking back in the chain past loads and past stores to fixed stack locations that do not overlap. I don't know if that is allowed, but in theory it could work.
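To make the chain walk described above concrete, here is a minimal sketch on a toy model. It does not use LLVM's SelectionDAG API: Node, Kind, overlaps and findForwardableStore are hypothetical names invented for illustration, and whether the real DAG combiner may legally skip these nodes remains the open question raised in the comment.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a chained DAG node. This is NOT LLVM's SDNode.
enum class Kind { EntryToken, Store, Load, CallSeqStart, Other };

struct Node {
  Kind kind;
  int64_t offset;     // byte offset of a fixed stack location (Store/Load only)
  int64_t size;       // access size in bytes (Store/Load only)
  const Node *chain;  // previous node on the chain, or nullptr at the start
};

// True if the byte ranges [offset, offset + size) of a and b intersect.
static bool overlaps(const Node &a, const Node &b) {
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// Walk back along the chain from a load. Returns the store whose value could
// be forwarded to the load, or nullptr if the walk has to give up.
const Node *findForwardableStore(const Node &load) {
  assert(load.kind == Kind::Load && "expected a load");
  for (const Node *n = load.chain; n; n = n->chain) {
    switch (n->kind) {
    case Kind::CallSeqStart:
    case Kind::Load:
      continue;  // assumed safe to skip in this model: neither writes memory
    case Kind::Store:
      if (n->offset == load.offset && n->size == load.size)
        return n;        // exact match: this store sets the loaded value
      if (overlaps(*n, load))
        return nullptr;  // partial overlap: forwarding is not straightforward
      continue;          // store to a disjoint fixed stack slot: look past it
    default:
      return nullptr;    // unknown node: be conservative and stop
    }
  }
  return nullptr;
}
```

With two stores to slots 0 and 8, then a CallSeqStart node, then two loads of the same slots, each load resolves to its matching store. A store that partially overlaps the loaded slot stops the search.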
Nov 14 2022
LGTM
Nov 9 2022
LGTM
Nov 8 2022
Nov 7 2022
Oct 3 2022
LGTM
Sep 14 2022
Aug 9 2022
Update diff to show diffs with new tests.
Aug 8 2022
Aug 5 2022
@t.p.northover, any concerns?
Jul 25 2022
Jul 21 2022
Forgot to include the new tests.
Jul 20 2022
Jul 19 2022
Dec 1 2021
LGTM
Nov 30 2021
Nov 29 2021
Nov 1 2021
Oct 28 2021
Oct 20 2021
Oct 18 2021
Oct 15 2021
Is this still needed?
Oct 12 2021
Oct 7 2021
Jan 22 2021
LGTM
Jan 21 2021
Test case?
Aug 24 2020
Aug 11 2020
Ping?
Jul 22 2020
Resolve comment.
Resolve some comments.
Improved testing and tightened checks.
Jun 16 2020
LGTM
May 19 2020
May 15 2020
May 14 2020
Oct 15 2019
Update test to use CHECK-COUNT.
Sep 27 2019
Moved the tests to a separate file and added hand-written checks which include the permute control data.
Sep 25 2019
Aug 26 2019
Aug 23 2019
Jul 18 2019
Jun 27 2019
Jun 24 2019
Jun 18 2019
Generated test checks with script.
Add X86 test.
Jun 17 2019
May 13 2019
The following file and command should reproduce the failure we are seeing.
May 6 2019
This revision causes a traceback when compiling SPEC2017 523.xalancbmk_r with -O3 -m64 -mcpu=power9 -flto and PGO, on the -fprofile-use compile step for XMLDateTime.cpp. Can you please take a look? Let me know if you don't have access to the SPEC2017 source and need a reproducer. The traceback was as follows:
Apr 29 2019
Apr 26 2019
Apr 18 2019
Apr 2 2019
Apr 1 2019
Feb 11 2019
Feb 8 2019
Update diff to show test changes and respond to comments.
Feb 7 2019
Feb 6 2019
Jan 25 2019
@nemanjai Yes, please, I do need it committed for me. Last time, I promise!
Deleted commented out code and added suggested comments.
Jan 17 2019
Added a new TTI method, vectorCostAdjustment, that consolidates the checks and cost modification and makes them uniform across all instruction types. Also added a direct cost model test.
Jan 11 2019
For memory ops it should be the same as arithmetic. The LSUs are a separate resource from the slices, but a vector load or store still consumes multiple LSUs (2x if aligned, 3x if not). I don't follow why there should be a problem with shuffle - I assume a shuffle will require one or more vector ALU ops.
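As a rough illustration of the LSU arithmetic above, the sketch below scales a scalar memory-op cost by the number of LSUs a vector access consumes. The function name vectorMemOpCost and its parameters are hypothetical and are not part of TTI; only the 2x/3x factors come from the comment.

```cpp
// Hypothetical helper, not an LLVM API: scale the cost of a scalar load or
// store by the number of LSUs its vector counterpart consumes.
int vectorMemOpCost(int scalarCost, bool isAligned) {
  const int lsusConsumed = isAligned ? 2 : 3;  // 2x if aligned, 3x if not
  return scalarCost * lsusConsumed;
}
```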
Jan 9 2019
I don't think adding a new TTI function is necessary, as I think the way I am modeling the costs here is what is expected. The following comment (from TargetTransformInfo.h: getArithmeticInstrCost()) may shed some light:
Fix comment.
Dec 18 2018
Dec 7 2018
Nov 21 2018
Nov 20 2018
Nice! Glad to see this stuff get completed.
Oct 23 2018
Address review comments - change variable names, support/test big-endian.
Oct 16 2018
Aug 15 2018
Changed == NULL to == nullptr.
Aug 14 2018
Update to access the constant shift amount via a dynamic cast and to invert the if from block form to an early exit.
Aug 8 2018
Updated to incorporate review comments.
Aug 2 2018
Jul 26 2018
Aug 24 2016
Unlike the php test failure, which depends on the library memcmp behaviour and fails for both clang and gcc, the postgres test failure only happens with clang. The uuid regression test fails for clang, and the failure goes away if src/backend/utils/adt/uuid.c is compiled with gcc. The issue is the result of the uuid_internal_cmp function, which is just a 16-byte memcmp. The address of the function is stored in a table of builtins and it is only called by address, and the complexity of the application and test environment makes it difficult to trace back to where this function is called, which may be in many places. It might be desirable to just be compatible with gcc. This diff updates the approach to use the gcc-type IPM/SLL/SRA sequence. The sequence is first translated into a SELECT_CMP operation. This makes it easier to perform the memcmp compare-to-zero optimization (SRA kills the CC). It should also make it easier to add support for LOCHI, since the compare-to-zero code can be shared, and to handle the promotion to 64-bit case with shared code.
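For readers unfamiliar with the IPM/SLL/SRA idiom, the following sketch models its arithmetic on a 32-bit value. IPM places the condition code in the two bits just below the top two bits of the low word, with the program mask in the four bits below that. The shift amounts of 2 and 30 are an assumption, since the comment above does not state them, and ccToInt is a hypothetical name.

```cpp
#include <cstdint>

// Model of IPM followed by SLL 2 and SRA 30 on the low 32 bits of a register.
// The shift amounts are an assumption; the comment above does not give them.
int32_t ccToInt(uint32_t cc, uint32_t programMask = 0, uint32_t low24 = 0) {
  // IPM: top two bits zero, CC in bits 29:28, program mask in bits 27:24,
  // low 24 bits left as they were.
  uint32_t ipm = ((cc & 3u) << 28) | ((programMask & 0xFu) << 24) |
                 (low24 & 0x00FFFFFFu);
  // SLL 2: moves the CC into the top two bits.
  uint32_t sll = ipm << 2;
  // SRA 30: arithmetic shift, written portably as a 2-bit sign extension.
  int32_t top2 = static_cast<int32_t>(sll >> 30);  // 0..3
  return (top2 & 2) ? top2 - 4 : top2;
}
```

Under these assumptions CC 0 maps to 0, CC 1 to a positive value and CC 2 to a negative value, so the sign of the memcmp result depends on the operand order of the compare that set the CC. The program mask and the low 24 bits are shifted out and do not affect the result.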