RKSimon (Simon Pilgrim)
User

Projects

User does not belong to any projects.

User Details

User Since
May 5 2014, 7:26 AM (180 w, 4 d)

Recent Activity

Today

RKSimon committed rL316226: [X86][SSE] Add missing _mm_extract_ps fast-isel test.
Fri, Oct 20, 12:31 PM
RKSimon created D39135: [X86][SSE] Add extractps/pextrd equivalence to domain tables.
Fri, Oct 20, 12:24 PM
RKSimon created D39134: [X86][SSE] Add MOVHPSrm to domain tables.
Fri, Oct 20, 12:05 PM
RKSimon committed rL316222: [X86][SSE] getTargetShuffleMask - check shuffle input value types. NFCI.
Fri, Oct 20, 11:09 AM
RKSimon committed rL316217: [X86] Check all CPU target names.
Fri, Oct 20, 9:57 AM
RKSimon added a comment to D39051: [X86][F16C] Update instruction scheduling on btver2.

These latencies/throughputs still don't match the AMD docs - please match those and not the Agner tests

Fri, Oct 20, 9:37 AM
RKSimon added a comment to D39046: [X86][SSE41] [AVX]Update instruction scheduling on btver2.

To rename as

[X86][SSE41][AVX] Update instruction scheduling on btver2

Fri, Oct 20, 9:35 AM
RKSimon accepted D38967: [SelectionDAG] Don't subject ISD:Constant to the depth limit in TargetLowering::SimplifyDemandedBits.

LGTM - worth doing the same for SelectionDAG::ComputeNumSignBits and SelectionDAG::computeKnownBits ?

Fri, Oct 20, 7:14 AM
RKSimon committed rL316213: [X86][AVX512] Regenerate regcall tests.
Fri, Oct 20, 7:13 AM

Yesterday

RKSimon committed rL316176: [X86][AES] Test AES intrinsics on 32/64-bit targets with/without VEX encoding.
Thu, Oct 19, 12:05 PM
RKSimon committed rL316162: [X86] Replace custom scalar integer absolute matching with ISD::ABS lowering.
Thu, Oct 19, 8:05 AM
RKSimon closed D38895: [X86] Replace custom scalar integer absolute matching with ISD::ABS lowering. by committing rL316162: [X86] Replace custom scalar integer absolute matching with ISD::ABS lowering.
Thu, Oct 19, 8:05 AM
RKSimon committed rL316161: Fix MSVC signed/unsigned comparison warning.
Thu, Oct 19, 8:00 AM
RKSimon committed rL316160: [X86] Add scalar (abs (abs x)) -> (abs x) combine test.
Thu, Oct 19, 8:00 AM
RKSimon added a comment to D39077: [X86] Teach the assembly parser to warn on duplicate registers in gather instructions.

Should we test for cases where the index register is a different size to the mask/result? Like _mm256_i32gather_epi64?

Thu, Oct 19, 7:33 AM

Wed, Oct 18

RKSimon added a comment to D39046: [X86][SSE41] [AVX]Update instruction scheduling on btver2.

I realise this was for SSE41 instructions, but given that it's just dot product ops, it might be better to rename it and add the VDPPSY cases as well?

Wed, Oct 18, 7:47 AM
RKSimon added inline comments to D39054: [X86][Broadwell] Added the instruction scheduling information for the Broadwell CPU.
Wed, Oct 18, 7:42 AM
RKSimon added inline comments to D39051: [X86][F16C] Update instruction scheduling on btver2.
Wed, Oct 18, 7:25 AM

Tue, Oct 17

RKSimon committed rL316033: [X86][SSE] Tests packuswb/truncation codegen from PR34773.
Tue, Oct 17, 2:15 PM
RKSimon committed rL316017: [DAGCombine] Add SCALAR_TO_VECTOR undef handling to simplifyShuffleMask.
Tue, Oct 17, 11:15 AM
RKSimon updated the diff for D38696: [DAGCombine] Permit combining of shuffle of equivalent splat BUILD_VECTORs.

Updated to reuse more of the existing BUILD_VECTOR code.

Tue, Oct 17, 10:58 AM
RKSimon accepted D38994: [X86][Broadwell] Added the broadwell cpu to the scheduling regression tests.<NFC>.

LGTM

Tue, Oct 17, 5:39 AM

Mon, Oct 16

RKSimon committed rL315955: [X86][AVX] Add v4x64 vector shuffle test for <0,2,1,3> mask.
Mon, Oct 16, 4:20 PM
RKSimon committed rL315942: [X86][3DNow] Add scheduling latency/throughput tests for 3DNow! instructions.
Mon, Oct 16, 2:55 PM
RKSimon committed rL315939: [X86][MMX] Add scheduling latency/throughput tests for MMX instructions.
Mon, Oct 16, 2:29 PM
RKSimon added a comment to D38967: [SelectionDAG] Don't subject ISD:Constant to the depth limit in TargetLowering::SimplifyDemandedBits.

It doesn't look like SimplifyDemandedBits or computeKnownBits currently handles ConstantFP. We probably don't cross any fp->integer boundaries when recursing. I definitely see an early out in the ISD::BITCAST handling if it's a cast from FP.

Mon, Oct 16, 2:19 PM
RKSimon accepted D38727: [X86][SKL] Updated scheduling information for the SkylakeClient target.

LGTM. I'm intending to add MMX scheduling tests shortly so if they land before this you may need to rebase + regenerate.

Mon, Oct 16, 10:57 AM
RKSimon committed rL315907: Fix test name typo.
Mon, Oct 16, 7:34 AM
RKSimon committed rL315906: [X86][SSE] Added additional PACKUS shuffle tests.
Mon, Oct 16, 7:32 AM
RKSimon committed rL315903: Fix or vs || typo.
Mon, Oct 16, 7:02 AM
RKSimon added inline comments to D38732: [X86][AVX512] Improve lowering of AVX512 test intrinsics.
Mon, Oct 16, 5:47 AM

Sat, Oct 14

RKSimon committed rL315826: [TableGen] Avoid unnecessary std::string creations.
Sat, Oct 14, 2:28 PM
RKSimon committed rL315825: [X86][SSE] Don't attempt to reduce the imul vector width of odd sized vectors….
Sat, Oct 14, 12:57 PM
RKSimon committed rL315824: [X86][SSE] Test vector imul reduction on 32 and 64-bit targets.
Sat, Oct 14, 12:46 PM
RKSimon committed rL315818: Pull out repeated calls to VT.getVectorNumElements(). NFCI.
Sat, Oct 14, 10:38 AM
RKSimon committed rL315817: Cleanup update_llc_test_checks.py notes.
Sat, Oct 14, 10:37 AM
RKSimon committed rL315815: Use DAG::getBitcast() helper. NFCI.
Sat, Oct 14, 10:14 AM
RKSimon committed rL315807: [X86][SSE] Support combining AND(EXTRACT(SHUF(X)), C) -> EXTRACT(SHUF(X)).
Sat, Oct 14, 8:01 AM
RKSimon added inline comments to D33099: [X86] Model 256-bit AVX instructions in the AMD Jaguar scheduler (PR28573).
Sat, Oct 14, 7:07 AM

Fri, Oct 13

RKSimon added a reviewer for D38890: [X86] Add FeatureSlowBTMem to Haswell, Broadwell, Skylake, Cannonlake, and Knights Landing CPUs.: gadi.haber.
Fri, Oct 13, 10:29 AM
RKSimon created D38895: [X86] Replace custom scalar integer absolute matching with ISD::ABS lowering.
Fri, Oct 13, 10:25 AM
RKSimon committed rL315711: [X86] Test scalar integer absolutes on 32-bit targets with/without CMOV.
Fri, Oct 13, 10:09 AM
RKSimon committed rL315706: [X86] Updated scalar integer absolute tests to cover i8/i16/i32/i64.
Fri, Oct 13, 9:53 AM
RKSimon resigned from D38128: Handle COPYs of physregs better (regalloc hints).
Fri, Oct 13, 9:08 AM
RKSimon accepted D38836: Use X86ISD::VBROADCAST in place of v2f64 X86ISD::MOVDDUP when AVX2 is available.

LGTM

Fri, Oct 13, 9:07 AM
RKSimon accepted D38811: [x86] Add initial skeleton support for "knm" cpu - llvm version.

LGTM with one minor

Fri, Oct 13, 9:05 AM
RKSimon added inline comments to D38836: Use X86ISD::VBROADCAST in place of v2f64 X86ISD::MOVDDUP when AVX2 is available.
Fri, Oct 13, 7:10 AM
RKSimon added inline comments to D38714: [AVX512] Don't mark EXTLOAD as legal with AVX512. Continue using custom lowering.
Fri, Oct 13, 7:08 AM
RKSimon added inline comments to D38811: [x86] Add initial skeleton support for "knm" cpu - llvm version.
Fri, Oct 13, 7:03 AM
RKSimon accepted D38813: [X86] Add skeleton support for "knm" cpu - clang side.

LGTM

Fri, Oct 13, 6:58 AM
RKSimon accepted D38664: [X86] Stop creating CMOV nodes with a second MVT::Glue result.

LGTM

Fri, Oct 13, 6:51 AM
RKSimon accepted D36706: DAGCombiner: Add form of isFPExtFree to check uses.

LGTM - naturally D38510 needs updating to the new API

Fri, Oct 13, 4:00 AM

Thu, Oct 12

RKSimon added a comment to D38495: [X86] Fix bug in legalize vector types - Split large loads.

Possibly add a test case for PR34657 as well? @zvi has already reduced much of it

Thu, Oct 12, 3:03 PM
RKSimon added inline comments to D38696: [DAGCombine] Permit combining of shuffle of equivalent splat BUILD_VECTORs.
Thu, Oct 12, 11:43 AM
RKSimon accepted D38781: [X86] Add CLWB intrinsic. clang part.

LGTM - one minor

Thu, Oct 12, 11:41 AM
RKSimon added inline comments to D38811: [x86] Add initial skeleton support for "knm" cpu - llvm version.
Thu, Oct 12, 10:39 AM
RKSimon added inline comments to D38811: [x86] Add initial skeleton support for "knm" cpu - llvm version.
Thu, Oct 12, 9:25 AM
RKSimon added inline comments to D38781: [X86] Add CLWB intrinsic. clang part.
Thu, Oct 12, 9:01 AM
RKSimon committed rL315587: [X86][SSE] Pull out repeated INSERT_VECTOR_ELT code from LowerBUILD_VECTOR….
Thu, Oct 12, 8:52 AM
RKSimon added a comment to D38832: [X86][SelectionDAG] Add support for simplifying demanded bits of target nodes. Use it to simplify demanded bits of CMOV.

A few comments but nice to see this being pursued - not sure if this will need breaking down into incremental changes at commit time.

Thu, Oct 12, 3:36 AM

Wed, Oct 11

RKSimon added inline comments to D36691: [AsmParser] Add DiagnosticString to register classes in tablegen.
Wed, Oct 11, 9:16 AM
RKSimon committed rL315471: Spelling mistake in comment. NFCI.
Wed, Oct 11, 9:10 AM
RKSimon added inline comments to D38771: [x86] avoid infinite loop from SoftenFloatOperand (PR34866).
Wed, Oct 11, 3:46 AM
RKSimon added inline comments to D38732: [X86][AVX512] Improve lowering of AVX512 test intrinsics.
Wed, Oct 11, 3:31 AM
RKSimon added a comment to D38318: [X86][SSE] Match PSHUFLW/PSHUFHW + PSHUFD vXi16 shuffle patterns (PR34686).

@delena @zvi What do you want to do with this? IMO we shouldn't be prematurely combining to variable mask shuffles, and this should be performed later as a scheduler-based decision. But that will involve a lot of work that I don't think we're ready for (D26855 tried to move some other code to the MC and hit a lot of issues).

What we could do is add a FeatureFastVariableShuffle feature flag to Haswell and later Intel CPUs and perform the decision in combineX86ShuffleChain off that?

Maybe just add something like this:

bool hasVariableShuffle(MVT Ty) {
  // XXX / YYY are placeholders for the relevant vector types.
  if ((hasAVX2() && Ty == XXX) || (hasAVX512() && Ty == YYY))
    return true;
  return false;
}

Wed, Oct 11, 3:10 AM
RKSimon accepted D38784: [X86] Remove MVT::i1 handling code from LowerTRUNCATE.

LGTM

Wed, Oct 11, 1:51 AM

Tue, Oct 10

RKSimon added inline comments to D38693: [SLP] Consider extractelements as shuffles iff they have the same type/parent etc..
Tue, Oct 10, 1:15 PM
RKSimon added inline comments to D38697: [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.
Tue, Oct 10, 1:09 PM
RKSimon added inline comments to D38756: [x86] use an insert op to put one variable element into a constant of vectors.
Tue, Oct 10, 1:00 PM
RKSimon added inline comments to D38727: [X86][SKL] Updated scheduling information for the SkylakeClient target.
Tue, Oct 10, 12:56 PM
RKSimon added a comment to D38736: [X86] test/testn intrinsics lowering to IR. llvm part.

*-fast-isel.ll tests to match the clang *-builtins tests?

Tue, Oct 10, 10:01 AM
RKSimon committed rL315322: [X86][AVX512] Regenerate element insertion/extraction tests.
Tue, Oct 10, 8:59 AM
RKSimon added inline comments to D38732: [X86][AVX512] Improve lowering of AVX512 test intrinsics.
Tue, Oct 10, 8:20 AM
RKSimon added a comment to D38732: [X86][AVX512] Improve lowering of AVX512 test intrinsics.

Add the new test files to trunk with current codegen and then rebase to show the diff from this patch.

Tue, Oct 10, 8:06 AM
RKSimon committed rL315314: Fix a (slightly weird) 'comma operator within array index expression' warning….
Tue, Oct 10, 6:56 AM
RKSimon added inline comments to D38684: [X86][AVX512] lowering broadcastm intrinsic - llvm part.
Tue, Oct 10, 6:03 AM
RKSimon added inline comments to D38683: [X86][AVX512] lowering broadcastm intrinsic - clang part.
Tue, Oct 10, 6:01 AM
RKSimon accepted D38388: [DAGCombiner, x86] convert insertelement of bitcasted vector into shuffle.

LGTM

Tue, Oct 10, 5:59 AM
RKSimon added a comment to D38697: [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle..

This patch seems to change the blending-shuffle.ll test case in the same way as D38693 - what is the relationship/dependency between them?

Tue, Oct 10, 4:37 AM
RKSimon added a comment to D38495: [X86] Fix bug in legalize vector types - Split large loads.

The bug was filed against Clang 3.8; it passed with clang-4.0 and clang-ToT (top of trunk).
Seems like the issue was fixed a while ago.

Tue, Oct 10, 4:34 AM
RKSimon added a comment to D38466: [ TargetLowering, AMDGPU] Use the return value of UpdateNodeOperands(); .

Where did the code change go in the diff?

Tue, Oct 10, 4:33 AM
RKSimon added inline comments to D38714: [AVX512] Don't mark EXTLOAD as legal with AVX512. Continue using custom lowering.
Tue, Oct 10, 4:19 AM
RKSimon accepted D38689: [X86] Fix a bug with i386 subtarget in LowerCONCAT_VECTORSvXi1 func.

LGTM with one minor

Tue, Oct 10, 4:17 AM

Mon, Oct 9

RKSimon added inline comments to D38466: [ TargetLowering, AMDGPU] Use the return value of UpdateNodeOperands(); .
Mon, Oct 9, 1:51 PM
RKSimon accepted D38701: Fix LLDB build for Android.

LGTM

Mon, Oct 9, 1:51 PM
RKSimon added a reviewer for D38689: [X86] Fix a bug with i386 subtarget in LowerCONCAT_VECTORSvXi1 func: RKSimon.

Test case?

Mon, Oct 9, 1:21 PM
RKSimon created D38696: [DAGCombine] Permit combining of shuffle of equivalent splat BUILD_VECTORs.
Mon, Oct 9, 11:16 AM
RKSimon added a comment to D38684: [X86][AVX512] lowering broadcastm intrinsic - llvm part.

Please add fast-isel tests that match the builtin tests from D38683

Mon, Oct 9, 5:23 AM
RKSimon accepted D38685: [X86][SKYLAKE] Update regression test to differentiate between HASWELL and SKYLAKE scheduling.<NFC> .

LGTM - is it worth adding the SKX test lines as well while you're at it?

Mon, Oct 9, 4:40 AM

Sun, Oct 8

RKSimon committed rL315195: [X86][SSE] Don't call combineTo inside combineX86ShufflesRecursively. NFCI.
Sun, Oct 8, 2:00 PM
RKSimon committed rL315187: Tidyup with clang-format. NFCI.
Sun, Oct 8, 12:26 PM
RKSimon committed rL315186: [X86][SSE] Add test case for PR27708.
Sun, Oct 8, 12:20 PM
RKSimon committed rL315182: [X86] getTargetConstantBitsFromNode - add support for decoding scalar constants.
Sun, Oct 8, 10:23 AM
RKSimon added a comment to D38318: [X86][SSE] Match PSHUFLW/PSHUFHW + PSHUFD vXi16 shuffle patterns (PR34686).

@delena @zvi What do you want to do with this? IMO we shouldn't be prematurely combining to variable mask shuffles, and this should be performed later as a scheduler-based decision. But that will involve a lot of work that I don't think we're ready for (D26855 tried to move some other code to the MC and hit a lot of issues).

Sun, Oct 8, 6:11 AM
RKSimon committed rL315176: [X86][XOP] Add XOP oddshuffles tests.
Sun, Oct 8, 6:00 AM
RKSimon added inline comments to D37896: [DAGCombine] Resolving PR34474 by transforming mul(x, 2^c +/- 1) -> sub/add(shl(x, c) x) for any type including vector types.
Sun, Oct 8, 4:22 AM
RKSimon added a comment to D36706: DAGCombiner: Add form of isFPExtFree to check uses.

A few style comments, but it's up to @efriedma and the PPC guys whether they are happy with this change.

Sun, Oct 8, 4:17 AM
RKSimon accepted D38443: [X86][SKX] Adding the scheduling information for the SKX target.

LGTM

Sun, Oct 8, 3:39 AM
RKSimon accepted D38023: [X86] Prefer MOVSS/SD over BLENDI during legalization. Remove BLENDI versions of scalar arithmetic patterns.

LGTM, please can you raise bugs for both custom handling of domain changes (PBLENDW <-> BLENDPS etc.) and adding isel patterns (optsize or not) for MOVSS/MOVSD with BLENDPS/BLENDPD?

Sun, Oct 8, 3:32 AM
RKSimon accepted D38609: [X86] Enable extended comparison predicate support for SETUEQ/SETONE when targeting AVX instructions.

Possibly regenerate fast-isel-select-pseudo-cmov.ll before the patch?

Sun, Oct 8, 3:19 AM