RKSimon (Simon Pilgrim)
User

Projects

User does not belong to any projects.

User Details

User Since
May 5 2014, 7:26 AM (159 w, 12 h)

Recent Activity

Today

RKSimon added a comment to D33406: PR28129 expand vector operation to an IR constant..

Test _mm256_cmp_pd as well?

Mon, May 22, 5:18 AM
RKSimon added inline comments to D33311: [X86] Match bitcast of vsetcc to pmovmsk.
Mon, May 22, 3:45 AM

Yesterday

RKSimon added inline comments to D30471: [SDAG] Relax conditions under stores of loaded values can be merged.
Sun, May 21, 7:50 AM

Fri, May 19

RKSimon committed rL303448: Fix line-endings..
Fix line-endings.
Fri, May 19, 1:00 PM
RKSimon committed rL303435: [X86][FMA] Tests showing missed fmsubadd opportunities (PR30633).
[X86][FMA] Tests showing missed fmsubadd opportunities (PR30633)
Fri, May 19, 10:33 AM
RKSimon added inline comments to D32684: [X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables..
Fri, May 19, 6:54 AM
RKSimon added inline comments to D32925: [DAGCombine] (add/uaddo X, Carry) -> (addcarry X, 0, Carry).
Fri, May 19, 6:50 AM
RKSimon accepted D33310: [APInt] Add support for dividing or remainder by a uint64_t or int64_t..

LGTM

Fri, May 19, 6:44 AM

Thu, May 18

RKSimon added a comment to D32916: [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X).

I suggest that we just delete mul-i1024.ll before accepting this patch - any objections?

Thu, May 18, 6:25 AM
RKSimon added a comment to D33311: [X86] Match bitcast of vsetcc to pmovmsk.

Why don't you use MOVMSKPD/MOVMSKPS for the 32/64 bit cases and avoid the vector truncation?

Thu, May 18, 5:35 AM
RKSimon committed rL303344: Fix 'not all control paths return a value' warning on windows buildbots..
Fix 'not all control paths return a value' warning on windows buildbots.
Thu, May 18, 4:01 AM
RKSimon committed rL303342: [X86][AVX512] Add 512-bit vector ctpop costs + tests.
[X86][AVX512] Add 512-bit vector ctpop costs + tests
Thu, May 18, 3:56 AM
RKSimon added a comment to D33310: [APInt] Add support for dividing or remainder by a uint64_t or int64_t..

Add extra tests?

Thu, May 18, 3:15 AM

Wed, May 17

RKSimon committed rL303300: [X86][AVX512] Add 512-bit vector ctlz costs + tests.
[X86][AVX512] Add 512-bit vector ctlz costs + tests
Wed, May 17, 2:15 PM
RKSimon committed rL303293: [X86][AVX512] Add 512-bit vector cttz costs + tests.
[X86][AVX512] Add 512-bit vector cttz costs + tests
Wed, May 17, 1:36 PM
RKSimon committed rL303290: [X86] Split ctpop/ctlz/cttz cost tests.
[X86] Split ctpop/ctlz/cttz cost tests
Wed, May 17, 1:10 PM
RKSimon committed rL303283: [X86][AVX512] Add 512-bit vector bitreverse costs + tests.
[X86][AVX512] Add 512-bit vector bitreverse costs + tests
Wed, May 17, 12:33 PM
RKSimon added a comment to D33169: [X86] Adding vpopcntd and vpopcntq instructions.

The test files need some attention.

Wed, May 17, 12:02 PM
RKSimon accepted D32273: [X86][AVX512] Make i1 illegal in the CodeGen.

LGTM - I'm happy for the missed constant fold to be handled in a followup, add a TODO if you can.

Wed, May 17, 11:29 AM
RKSimon accepted D33168: Fix PR33028.

LGTM

Wed, May 17, 10:41 AM
RKSimon added a comment to D33168: Fix PR33028.

I think this is sound, but I'm not an expert on any of this. Any other comments?

Wed, May 17, 6:10 AM
RKSimon added inline comments to D33169: [X86] Adding vpopcntd and vpopcntq instructions.
Wed, May 17, 6:04 AM

Tue, May 16

RKSimon added inline comments to D33099: AMD Jaguar scheduler doesn't correctly model 256-bit AVX instructions.
Tue, May 16, 2:27 PM
RKSimon added a comment to D33168: Fix PR33028.

Please can you regenerate the diff with context?

Tue, May 16, 2:10 PM
RKSimon added a reviewer for D33203: Add scheduler classes to integer/float horizontal operations: andreadb.

This needs to be done in general, not just for Jaguar. Please can you add WriteFHAdd and WriteVecHAdd defs in X86Schedule.td, and then tag the relevant instructions in X86InstrSSE.td and X86InstrAVX512.td. Then in ScheduleBtVer2.td you need to add instances of the 2 defs and special case the ymm versions. Either add TODOs for the other x86 models or add them if you want to dig through Agner's tables.

Tue, May 16, 1:50 PM

Mon, May 15

RKSimon added a comment to D33169: [X86] Adding vpopcntd and vpopcntq instructions.

A possible addition would be to custom lower i8/i16 vectors with a trunc(popcnt(zext)) pattern.

Mon, May 15, 2:14 PM
RKSimon added a comment to D33169: [X86] Adding vpopcntd and vpopcntq instructions.

Do we have a generic ctpop test like we do for tzcnt and lzcnt? If so should we just add command lines to that instead of a new intrinsic test?

Mon, May 15, 2:05 PM
RKSimon added inline comments to D33169: [X86] Adding vpopcntd and vpopcntq instructions.
Mon, May 15, 12:45 PM
RKSimon committed rL303082: [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as….
[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as…
Mon, May 15, 10:31 AM
RKSimon closed D33189: [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) by committing rL303082: [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as….
Mon, May 15, 10:31 AM
RKSimon committed rL303078: Fix windows buildbots - missing include and namespace.
Fix windows buildbots - missing include and namespace
Mon, May 15, 9:49 AM
RKSimon committed rL303074: [SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8….
[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8…
Mon, May 15, 9:01 AM
RKSimon committed rL303069: [SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 shifts.
[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 shifts
Mon, May 15, 7:40 AM
RKSimon created D33189: [NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146).
Mon, May 15, 5:14 AM
RKSimon added a comment to D32680: [X86] Apply the new instruction's register classes constraints on the operands of the replaced instruction when memory folding.

@craig.topper, I encountered this issue by coincidence but could not find a test case that fails without this. All the folded instructions right now have the same register class constraints for both register and memory forms.
But as we entirely replace the instruction, I saw that this change should be added even though it doesn't affect anything right now. Maybe it will be needed in the future.

Mon, May 15, 4:10 AM
RKSimon committed rL303047: [NVPTX] Don't rely on default arguments to SelectionDAG::getMemIntrinsicNode..
[NVPTX] Don't rely on default arguments to SelectionDAG::getMemIntrinsicNode.
Mon, May 15, 4:01 AM

Sun, May 14

RKSimon committed rL303023: [X86][AVX1] Account for cost of extract/insert of 256-bit shifts.
[X86][AVX1] Account for cost of extract/insert of 256-bit shifts
Sun, May 14, 2:05 PM
RKSimon committed rL303022: [X86][AVX2] Fix costs for v4i64 ashr by splat.
[X86][AVX2] Fix costs for v4i64 ashr by splat
Sun, May 14, 1:39 PM
RKSimon committed rL303021: [X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat.
[X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat
Sun, May 14, 1:16 PM
RKSimon committed rL303017: [X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul….
[X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul…
Sun, May 14, 12:05 PM
RKSimon committed rL303013: [X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift +….
[X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift +…
Sun, May 14, 11:13 AM
RKSimon committed rL303012: [X86][SSE] Account for cost of extract/insert of v32i8 vector shifts.
[X86][SSE] Account for cost of extract/insert of v32i8 vector shifts
Sun, May 14, 10:49 AM
RKSimon added a comment to D33169: [X86] Adding vpopcntd and vpopcntq instructions.

Disassembler tests?

Sun, May 14, 10:40 AM
RKSimon committed rL303010: [X86][XOP] Account for cost of extract/insert of 256-bit vector shifts.
[X86][XOP] Account for cost of extract/insert of 256-bit vector shifts
Sun, May 14, 6:52 AM
RKSimon added a reviewer for D33170: [X86] Adding avx512_vpopcntdq feature set and its intrinsics: RKSimon.
Sun, May 14, 5:36 AM
RKSimon added a reviewer for D33169: [X86] Adding vpopcntd and vpopcntq instructions: RKSimon.

Please can you rebase against trunk latest?

Sun, May 14, 5:29 AM
RKSimon committed rL303009: [X86][AVX] Allow 32-bit targets to peek through subvectors to extract constant….
[X86][AVX] Allow 32-bit targets to peek through subvectors to extract constant…
Sun, May 14, 4:59 AM
RKSimon committed rL303008: [X86][AVX] Add additional 32-bit target vector shift tests.
[X86][AVX] Add additional 32-bit target vector shift tests
Sun, May 14, 4:26 AM
RKSimon added inline comments to D33166: [ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits..
Sun, May 14, 3:41 AM
RKSimon added a comment to D33168: Fix PR33028.

Thank you for looking at this.

Sun, May 14, 3:27 AM

Sat, May 13

RKSimon committed rL302997: [SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts….
[SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts…
Sat, May 13, 3:24 PM
RKSimon committed rL302994: [X86][SSE] Test showing missing EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts….
[X86][SSE] Test showing missing EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts…
Sat, May 13, 3:03 PM
RKSimon committed rL302993: [SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits.
[SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits
Sat, May 13, 1:10 PM
RKSimon committed rL302992: [X86][SSE] Test showing inability of ComputeNumSignBits to resolve shuffles.
[X86][SSE] Test showing inability of ComputeNumSignBits to resolve shuffles
Sat, May 13, 10:54 AM
RKSimon added inline comments to D32684: [X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables..
Sat, May 13, 8:11 AM
RKSimon committed rL302989: [x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization).
[x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization)
Sat, May 13, 6:56 AM
RKSimon closed D32416: [x86, SSE] AVX1 PR28129 by committing rL302989: [x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization).
Sat, May 13, 6:56 AM
RKSimon committed rL302988: [LoopOptimizer][Fix]PR32859, PR24738.
[LoopOptimizer][Fix]PR32859, PR24738
Sat, May 13, 6:39 AM
RKSimon closed D33055: [LoopOptimizer][Fix]PR32859, PR24738 by committing rL302988: [LoopOptimizer][Fix]PR32859, PR24738.
Sat, May 13, 6:39 AM
RKSimon added inline comments to D32273: [X86][AVX512] Make i1 illegal in the CodeGen.
Sat, May 13, 5:54 AM

Fri, May 12

RKSimon accepted D32416: [x86, SSE] AVX1 PR28129 .

LGTM - with a future patch to investigate using ExecutionDepsFix

Fri, May 12, 3:06 PM
RKSimon committed rL302942: [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146).
[NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146)
Fri, May 12, 1:10 PM
RKSimon closed D33147: [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) by committing rL302942: [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146).
Fri, May 12, 1:09 PM
RKSimon created D33147: [NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146).
Fri, May 12, 12:25 PM
RKSimon added inline comments to D33137: [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790).
Fri, May 12, 11:13 AM
RKSimon committed rL302927: Strip trailing whitespace. NFCI..
Strip trailing whitespace. NFCI.
Fri, May 12, 10:55 AM
RKSimon committed rL302907: [DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI..
[DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI.
Fri, May 12, 8:40 AM
RKSimon committed rL302897: [DAGCombine] Use SelectionDAG::getZExtOrTrunc helper. NFCI..
[DAGCombine] Use SelectionDAG::getZExtOrTrunc helper. NFCI.
Fri, May 12, 6:35 AM
RKSimon committed rL302896: Use SDValue::getOperand() helper. NFCI..
Use SDValue::getOperand() helper. NFCI.
Fri, May 12, 6:33 AM
RKSimon committed rL302894: Use SDValue::getOperand() helper. NFCI..
Use SDValue::getOperand() helper. NFCI.
Fri, May 12, 6:22 AM
RKSimon added a comment to D32218: X86AsmParser.cpp asserts: OperandStack.size() > 1 && "Too few operands.".

@mcrosier Any comments?

Fri, May 12, 3:49 AM
RKSimon added a comment to D32508: [ValueTracking] Begin adding some useful methods to the proposed KnownBits struct.

Is this patch still relevant?

Fri, May 12, 3:45 AM
RKSimon accepted D32931: [KnownBits] Add bit counting methods to KnownBits struct and use them where possible.

LGTM with a couple of minor queries

Fri, May 12, 3:37 AM
RKSimon accepted D33116: [APInt] Use MathExtras.h BitsToFloat/Double and Float/DoubleToBits instead of type punning through a union.

LGTM

Fri, May 12, 3:29 AM

Thu, May 11

RKSimon added inline comments to D32684: [X86] Adding new LLVM TableGen backend that generates the X86 backend memory folding tables..
Thu, May 11, 2:47 PM
RKSimon accepted D33073: [APInt] Add a utility method to change the bit width and storage size of an APInt..

LGTM

Thu, May 11, 2:41 PM
RKSimon committed rL302808: [DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI..
[DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI.
Thu, May 11, 9:54 AM
RKSimon committed rL302804: [X86][AVX] Added zeroall/zeroupper scheduler tests.
[X86][AVX] Added zeroall/zeroupper scheduler tests
Thu, May 11, 8:16 AM
RKSimon added a reviewer for D33099: AMD Jaguar scheduler doesn't correctly model 256-bit AVX instructions: andreadb.
Thu, May 11, 6:51 AM
RKSimon added inline comments to D33099: AMD Jaguar scheduler doesn't correctly model 256-bit AVX instructions.
Thu, May 11, 6:41 AM
RKSimon committed rL302784: Strip trailing whitespace. NFCI..
Strip trailing whitespace. NFCI.
Thu, May 11, 3:16 AM
RKSimon accepted D32797: [X86] Moving X86Local namespace from .cpp to .h file to use it in memory folding TableGen backend..

LGTM

Thu, May 11, 3:09 AM
RKSimon added inline comments to D33073: [APInt] Add a utility method to change the bit width and storage size of an APInt..
Thu, May 11, 3:02 AM

Wed, May 10

RKSimon committed rL302683: [X86][SSE] Check vec_set BUILD_VECTOR tests on both 32 and 64-bit targets.
[X86][SSE] Check vec_set BUILD_VECTOR tests on both 32 and 64-bit targets
Wed, May 10, 9:06 AM
RKSimon committed rL302651: [DAGCombiner] Dropped explicit (sra 0, x) -> 0 and (sra -1, x) -> 0 folds..
[DAGCombiner] Dropped explicit (sra 0, x) -> 0 and (sra -1, x) -> 0 folds.
Wed, May 10, 6:19 AM
RKSimon committed rL302641: [DAGCombiner] Add vector support to fold (shl/srl 0, x) -> 0.
[DAGCombiner] Add vector support to fold (shl/srl 0, x) -> 0
Wed, May 10, 5:47 AM

Tue, May 9

RKSimon added a comment to D32916: [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X).

See inline for a few nits, but I think this makes sense now. If I'm seeing the diffs correctly, there was no codegen difference for x86 after adding the 'and' mask, so we must be recognizing and optimizing that pattern. @RKSimon - do you see any other problems?

Tue, May 9, 1:13 PM
RKSimon committed rL302559: [X86][LWP] Remove MSVC LWP intrinsics stubs..
[X86][LWP] Remove MSVC LWP intrinsics stubs.
Tue, May 9, 11:03 AM
RKSimon committed rL302557: [X86][LWP] Removing LWP todo comment. NFCI..
[X86][LWP] Removing LWP todo comment. NFCI.
Tue, May 9, 10:56 AM
RKSimon committed rL302525: [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X).
[X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X)
Tue, May 9, 6:28 AM
RKSimon closed D32973: [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X) by committing rL302525: [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X).
Tue, May 9, 6:28 AM
RKSimon added a comment to D32797: [X86] Moving X86Local namespace from .cpp to .h file to use it in memory folding TableGen backend..

Should X86Local be inside the llvm namespace?

Tue, May 9, 3:25 AM
RKSimon accepted D32616: [X86] Add more patterns for BZHI isel.

Ryzen returns 0xffffffff as well.

Tue, May 9, 2:49 AM

Mon, May 8

RKSimon added inline comments to D32973: [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X).
Mon, May 8, 12:09 PM
RKSimon updated the diff for D32973: [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X).

Added v4i64/AVX2 assertion

Mon, May 8, 12:07 PM
RKSimon accepted D32394: Add extra operand to CALLSEQ_START to keep frame part set up previously.

LGTM

Mon, May 8, 11:24 AM
RKSimon added inline comments to D32973: [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X).
Mon, May 8, 11:22 AM
RKSimon created D32973: [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X).
Mon, May 8, 10:53 AM
RKSimon added a comment to D32770: [X86][LWP] Add clang support for LWP instructions..

Sorry I missed this patch, but shouldn't we add LWP to the relevant processors in test/Preprocessor/predefined-arch-macros.c and the command line switch testing to test/Preprocessor/x86_target_features.c?

Mon, May 8, 10:40 AM
RKSimon committed rL302445: [X86][LWP] Add __LWP__ macro tests.
[X86][LWP] Add __LWP__ macro tests
Mon, May 8, 10:39 AM