RKSimon (Simon Pilgrim)
User

Projects

User does not belong to any projects.

User Details

User Since
May 5 2014, 7:26 AM (163 w, 4 d)

Recent Activity

Today

RKSimon abandoned D34087: [X86] EltsFromConsecutiveLoads - detect split loads without a common load base (PR32940).

D34472 is providing a more general solution

Fri, Jun 23, 11:37 AM
RKSimon added a reviewer for D34559: [X86][DAG] Switch X86 Target to post-legalized store merge: filcab.
Fri, Jun 23, 11:36 AM
RKSimon committed rL306138: [X86][AVX] Regenerate i256 bitcasted store test.
[X86][AVX] Regenerate i256 bitcasted store test
Fri, Jun 23, 11:35 AM
RKSimon committed rL306133: Fix Wdocumentation warning..
Fix Wdocumentation warning.
Fri, Jun 23, 11:03 AM
RKSimon added a comment to D34472: [DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset.

Some minor comments.

Fri, Jun 23, 10:56 AM
RKSimon committed rL306131: Regenerate extract-store.ll tests.
Regenerate extract-store.ll tests
Fri, Jun 23, 10:20 AM
RKSimon committed rL306121: Remove trailing whitespace. NFCI..
Remove trailing whitespace. NFCI.
Fri, Jun 23, 9:36 AM
RKSimon committed rL306107: [X86][AVX] Extended vector average tests.
[X86][AVX] Extended vector average tests
Fri, Jun 23, 7:38 AM
RKSimon committed rL306104: [X86][SSE] Dropped -mcpu from vector average tests.
[X86][SSE] Dropped -mcpu from vector average tests
Fri, Jun 23, 7:17 AM
RKSimon committed rL306101: Fix double->float truncation warning on MSVC.
Fix double->float truncation warning on MSVC
Fri, Jun 23, 6:54 AM
RKSimon committed rL306097: [X86][SSE] Dropped -mcpu from scalar math tests.
[X86][SSE] Dropped -mcpu from scalar math tests
Fri, Jun 23, 6:08 AM
RKSimon committed rL306092: [X86][SSE] Dropped -mcpu from insertps tests.
[X86][SSE] Dropped -mcpu from insertps tests
Fri, Jun 23, 4:01 AM

Yesterday

RKSimon added a comment to D34157: [llvm-stress] Use C++11 mersenne_twister_engine random device instead of our own (PR32585).

ping?

Thu, Jun 22, 2:29 PM
RKSimon added a reviewer for D34504: [LLVM][X86][Goldmont] Adding new target-cpu: Goldmont: RKSimon.

Add goldmont to llvm\test\CodeGen\X86\cpus.ll

Thu, Jun 22, 2:24 PM
RKSimon added a reviewer for D34503: AVX-512: Fixed a crash during legalization of <3 x i8> type: RKSimon.
Thu, Jun 22, 2:16 PM
RKSimon accepted D34389: [AVX-512] Remove and autoupgrade the masked integer compare intrinsics.

LGTM

Thu, Jun 22, 10:55 AM
RKSimon accepted D32658: Supports lowerInterleavedStore() in X86InterleavedAccess..

Hi Simon,

Do you have further comments/concerns?

Farhana

Thu, Jun 22, 9:09 AM
RKSimon added a comment to D34472: [DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset.
  • test/CodeGen/X86/clear_upper_vector_element_bits.ll - DAG is improved, but scheduling causes a code size degradation. This is probably okay.
Thu, Jun 22, 3:07 AM

Wed, Jun 21

RKSimon added a comment to D33435: [SelectionDAG] reset NewNodesMustHaveLegalTypes flag between basic blocks.

@t.p.northover Are you happy for the AArch64 codegen regression to be raised as a bug?

Wed, Jun 21, 12:34 PM
RKSimon added a comment to D34141: [X86] Recognize constant arrays with special values and replace loads from it with subtract and shift instructions, which then may be replaced by BZHI machine instruction..

While I don't want to end up forcing you to over-engineer this, it seems this pass is currently very specific and should be generalised to make it straightforward to work for constant arrays of other patterns that are easily re-materializable (upper/lower masks, single bit masks, etc.) - "SubstituteLoadWithRematerializationPass", possibly driven by the TTI cost models?

I agree that the name of the pass should be more generic.
By "generalization", do you mean renaming? Or do you propose adding this transformation for all targets and asking TTI about its profitability?

Wed, Jun 21, 10:14 AM
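As a rough illustration of the kind of constant array discussed in D34141, the sketch below (plain Python, not LLVM code) compares a lookup table of low-bit masks against recomputing the same value with a subtract and a shift. The table contents, the 32-bit width, and the names `LOW_MASK_TABLE` and `low_mask_computed` are assumptions made for this example and are not taken from the patch.

```python
WIDTH = 32                      # assumed lane width for the example
ALL_ONES = (1 << WIDTH) - 1     # 0xFFFFFFFF

# Hypothetical constant array: entry i has the low i bits set.
LOW_MASK_TABLE = [(1 << i) - 1 for i in range(WIDTH + 1)]


def low_mask_computed(i: int) -> int:
    """Rematerialize LOW_MASK_TABLE[i] with a subtract and a shift."""
    if not 0 <= i <= WIDTH:
        raise ValueError("index out of range")
    # Python integers are unbounded, so shifting by WIDTH yields 0 here;
    # a fixed-width hardware shift may need i == 0 handled separately.
    return ALL_ONES >> (WIDTH - i)


def apply_mask_from_table(x: int, i: int) -> int:
    """Load the mask from the table, then AND it with x."""
    return x & LOW_MASK_TABLE[i]


def apply_mask_computed(x: int, i: int) -> int:
    """Compute the mask instead of loading it, then AND it with x."""
    return x & low_mask_computed(i)
```

Keeping only the low `i` bits of a value is the operation that a BZHI-style instruction performs directly, which is why a table of this shape is a candidate for replacement.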
RKSimon accepted D33517: [InstCombine] reverse bitcast + bitwise-logic canonicalization (PR33138).

LGTM. Please do the peekThroughBitcast helper as a NFC pre-commit.

Wed, Jun 21, 9:27 AM
RKSimon committed rL305916: [X86][SSE] Dropped -mcpu from 256-bit vector shuffle tests.
[X86][SSE] Dropped -mcpu from 256-bit vector shuffle tests
Wed, Jun 21, 7:52 AM
RKSimon committed rL305913: [X86][SSE] Dropped -mcpu from 128-bit vector shuffle tests.
[X86][SSE] Dropped -mcpu from 128-bit vector shuffle tests
Wed, Jun 21, 7:23 AM
RKSimon committed rL305910: [X86][SSE] Regenerate merge store tests.
[X86][SSE] Regenerate merge store tests
Wed, Jun 21, 6:47 AM
RKSimon committed rL305909: [X86][SSE] Dropped -mcpu from vector blend shuffle tests and regenerate.
[X86][SSE] Dropped -mcpu from vector blend shuffle tests and regenerate
Wed, Jun 21, 6:46 AM
RKSimon committed rL305908: [X86][SSE] Dropped -mcpu from vector shuffle tests.
[X86][SSE] Dropped -mcpu from vector shuffle tests
Wed, Jun 21, 6:27 AM
RKSimon committed rL305907: [X86][SSE] Dropped -mcpu from vector zero extend tests.
[X86][SSE] Dropped -mcpu from vector zero extend tests
Wed, Jun 21, 6:18 AM
RKSimon committed rL305906: [X86][SSE] Dropped -mcpu from variable shuffle tests.
[X86][SSE] Dropped -mcpu from variable shuffle tests
Wed, Jun 21, 6:16 AM
RKSimon committed rL305905: [X86][AVX] Add AVX1 shuffle truncation tests.
[X86][AVX] Add AVX1 shuffle truncation tests
Wed, Jun 21, 5:59 AM
RKSimon committed rL305904: [X86][SSE] Add SSE2/SSE42 shuffle truncation tests.
[X86][SSE] Add SSE2/SSE42 shuffle truncation tests
Wed, Jun 21, 5:59 AM
RKSimon updated subscribers of D34141: [X86] Recognize constant arrays with special values and replace loads from it with subtract and shift instructions, which then may be replaced by BZHI machine instruction..

While I don't want to end up forcing you to over-engineer this, it seems this pass is currently very specific and should be generalised to make it straightforward to work for constant arrays of other patterns that are easily re-materializable (upper/lower masks, single bit masks, etc.) - "SubstituteLoadWithRematerializationPass", possibly driven by the TTI cost models?

Wed, Jun 21, 5:49 AM
RKSimon added a comment to D34336: [x86] transform vector inc/dec to use -1 constant (PR33483).

Does anyone have any concerns about this? Otherwise I'd like to accept it.

Wed, Jun 21, 5:36 AM
RKSimon accepted D32582: [InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits.

LGTM, with the addition of some ctpop vector tests, even if they don't currently combine.

Wed, Jun 21, 5:27 AM
RKSimon accepted D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

LGTM with one minor about the comment in the lv test

Wed, Jun 21, 5:21 AM

Tue, Jun 20

RKSimon committed rL305810: [CostModel][X86] Add scalar arithmetic cost tests.
[CostModel][X86] Add scalar arithmetic cost tests
Tue, Jun 20, 10:11 AM
RKSimon committed rL305808: [CostModel][X86] Declare costs variables based on type.
[CostModel][X86] Declare costs variables based on type
Tue, Jun 20, 10:05 AM
RKSimon committed rL305801: [X86][SSE] Relax 0/-1 vector element insertion to work for any vector with….
[X86][SSE] Relax 0/-1 vector element insertion to work for any vector with…
Tue, Jun 20, 8:20 AM
RKSimon committed rL305790: Fix Wdocumentation warning.
Fix Wdocumentation warning
Tue, Jun 20, 5:29 AM
RKSimon committed rL305788: [X86][SSE] Dropped old INSERT_VECTOR_ELT lowering TODO.
[X86][SSE] Dropped old INSERT_VECTOR_ELT lowering TODO
Tue, Jun 20, 3:34 AM
RKSimon committed rL305787: Fixed test name. NFCI..
Fixed test name. NFCI.
Tue, Jun 20, 3:24 AM

Mon, Jun 19

RKSimon added inline comments to D34069: [DAGCombiner] Fix PR33368 (vector extend/truncate optimization).
Mon, Jun 19, 9:30 AM
RKSimon committed rL305693: Use range for loops. NFCI..
Use range for loops. NFCI.
Mon, Jun 19, 6:25 AM
RKSimon added a comment to D34341: [TableGen] Fix bug in TableGen CodeGenPatterns when adding variants of the patterns..

Any chance of a test case?

Mon, Jun 19, 4:16 AM
RKSimon added a reviewer for D34095: [DAG] Prevent CombineTo from deleting already deleted nodes: chandlerc.

Adding @chandlerc who refactored this code at rL214623

Mon, Jun 19, 4:13 AM
RKSimon added a comment to D33840: [DAGCombine] Do not try to deduplicate commutative operations if both operand are the same..

Do you have any stats?

Mon, Jun 19, 4:06 AM
RKSimon added a comment to D34336: [x86] transform vector inc/dec to use -1 constant (PR33483).

vpternlog does not have any idiom recognition.

For pcmpeq I think Intel only avoids the dependency but still executes it. What does AMD do?

Mon, Jun 19, 4:04 AM
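The arithmetic behind the D34336 title can be sketched as follows: in fixed-width lanes, adding 1 gives the same result as subtracting -1, and -1 is the all-ones bit pattern. This is a plain Python model of 32-bit lanes, not the actual lowering code; the lane width and function names are assumptions chosen for illustration.

```python
LANE_BITS = 32                       # assumed lane width
LANE_MASK = (1 << LANE_BITS) - 1
ALL_ONES = LANE_MASK                 # bit pattern of -1 in one lane


def inc_with_plus_one(lanes):
    """Increment each lane using a +1 constant."""
    return [(x + 1) & LANE_MASK for x in lanes]


def inc_with_minus_one(lanes):
    """Increment each lane by subtracting the all-ones (-1) constant."""
    return [(x - ALL_ONES) & LANE_MASK for x in lanes]


def dec_with_minus_one(lanes):
    """Decrement each lane by adding the all-ones (-1) constant."""
    return [(x + ALL_ONES) & LANE_MASK for x in lanes]
```

Both increment forms wrap around identically at the top of the lane range, so the choice between them depends only on which constant is cheaper to produce.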

Thu, Jun 15

RKSimon added a comment to D28907: [SLP] Fix for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops..

Some minor observations. It'd be great if we could reduce the size of this patch somehow.

Thu, Jun 15, 9:44 AM
RKSimon committed rL305476: Remove trailing whitespace. NFCI..
Remove trailing whitespace. NFCI.
Thu, Jun 15, 9:21 AM
RKSimon committed rL305472: [X86][AVX2] Fix issue in lowerV8I16GeneralSingleInputVectorShuffle that was….
[X86][AVX2] Fix issue in lowerV8I16GeneralSingleInputVectorShuffle that was…
Thu, Jun 15, 7:53 AM
RKSimon added a comment to D33188: [X86][AVX512] Improve lowering of AVX512 compare intrinsics (remove redundant shift left+right instructions)..

@aymanmus I'm sorry but I had to revert this at rL305470 as it's causing tablegen crashes on Windows buildbots: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/10612

Thu, Jun 15, 7:41 AM
RKSimon committed rL305470: Revert r305465: [X86][AVX512] Improve lowering of AVX512 compare intrinsics….
Revert r305465: [X86][AVX512] Improve lowering of AVX512 compare intrinsics…
Thu, Jun 15, 7:40 AM
RKSimon updated the diff for D34157: [llvm-stress] Use C++11 mersenne_twister_engine random device instead of our own (PR32585).

Updated as requested by @chandlerc - my only concern is I have no idea if uniform_int_distribution guarantees the same behaviour on different targets as mersenne does.

Thu, Jun 15, 5:31 AM

Wed, Jun 14

RKSimon added a comment to D34089: [llvm-stress] Ensure that the C++11 random device respects its min/max values (PR32585).

Is there a C++11 random class that covers the features we need (same results on different targets, drivable by input seed)?

Given that the libc++ mersenne_twister_engine tests test specifically that, yes.

see <libc++ root>/test/std/numerics/rand/rand.eng/rand.eng.mers/ctor_sseq.pass.cpp

Wed, Jun 14, 2:56 AM
RKSimon added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

I've added the SLP vectorization tests at rL305151

Simon, I saw that you have added that to the tests.
Is there a need to do anything else?

Wed, Jun 14, 1:56 AM

Tue, Jun 13

RKSimon accepted D34174: [x86] replace div/rem with shift/mask for shuffle combining.

LGTM

Tue, Jun 13, 3:02 PM
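For context on the D34174 title, the sketch below shows the general identity it relies on: when the divisor is a power of two and the index is non-negative, division becomes a right shift and remainder becomes a bitwise AND. It is a plain Python model; the function names and the interpretation of the index as a flat shuffle-mask index are assumptions for illustration, not code from the patch.

```python
def split_index_divrem(index: int, num_elts: int):
    """Split a flat index into (quotient, remainder) with div/rem."""
    return index // num_elts, index % num_elts


def split_index_shift_mask(index: int, num_elts: int):
    """Same split using shift and mask; num_elts must be a power of two."""
    if num_elts <= 0 or num_elts & (num_elts - 1):
        raise ValueError("num_elts must be a positive power of two")
    if index < 0:
        raise ValueError("index must be non-negative")
    shift = num_elts.bit_length() - 1    # log2(num_elts)
    return index >> shift, index & (num_elts - 1)
```

The equivalence does not hold for negative values under truncating division, which is why the sketch rejects them.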
RKSimon created D34157: [llvm-stress] Use C++11 mersenne_twister_engine random device instead of our own (PR32585).
Tue, Jun 13, 11:53 AM
RKSimon added a comment to D34089: [llvm-stress] Ensure that the C++11 random device respects its min/max values (PR32585).

After thinking about this, I agree with @bogner's comment that we should just remove the whole Random class, and use the C++11 random number facilities instead.

Tue, Jun 13, 9:54 AM
RKSimon edited reviewers for D34143: Handling of TRAP during isel, added: efriedma; removed: eli.friedman.
Tue, Jun 13, 7:01 AM
RKSimon added a comment to D34087: [X86] EltsFromConsecutiveLoads - detect split loads without a common load base (PR32940).

It looks like this is really an issue of "areNonVolatileConsecutiveLoads" not being clever enough. BaseIndexOffset in DAGCombiner should do better. I'd favor moving BaseIndexOffset out of the DAG Combiner and rewriting areNonVolatileConsecutiveLoads off of that.

Tue, Jun 13, 5:23 AM
RKSimon updated the diff for D34089: [llvm-stress] Ensure that the C++11 random device respects its min/max values (PR32585).

Added assertion that the random value doesn't exceed the maximum.

Tue, Jun 13, 4:17 AM
RKSimon committed rL305285: Strip UTF8 BOM that got added in rL305091.
Strip UTF8 BOM that got added in rL305091
Tue, Jun 13, 3:18 AM
RKSimon committed rL305284: [X86][SSE] Refactor getTargetConstantBitsFromNode to avoid large APInts….
[X86][SSE] Refactor getTargetConstantBitsFromNode to avoid large APInts…
Tue, Jun 13, 3:14 AM
RKSimon committed rL305282: Strip UTF8 BOM that got added for some reason in rL305163.
Strip UTF8 BOM that got added for some reason in rL305163
Tue, Jun 13, 2:59 AM

Mon, Jun 12

RKSimon added a comment to D34087: [X86] EltsFromConsecutiveLoads - detect split loads without a common load base (PR32940).

Also, popping up a level, do you know if there's a reason why this is done as an X86-specific pass? It seems entirely generic.

Mon, Jun 12, 8:06 AM
RKSimon added a reviewer for D34087: [X86] EltsFromConsecutiveLoads - detect split loads without a common load base (PR32940): filcab.
Mon, Jun 12, 5:42 AM
RKSimon committed rL305184: [X86][SSE] Change memop fragment to inherit from vec128load with local….
[X86][SSE] Change memop fragment to inherit from vec128load with local…
Mon, Jun 12, 3:02 AM
RKSimon closed D33902: [X86][SSE] Change memop fragment to inherit from vec128load with local alignment controls by committing rL305184: [X86][SSE] Change memop fragment to inherit from vec128load with local….
Mon, Jun 12, 3:02 AM
RKSimon added reviewers for D34056: Tail merge size: davide, filcab.

A few minors, but I'd prefer someone more knowledgeable in this area to review it.

Mon, Jun 12, 2:23 AM

Sun, Jun 11

RKSimon committed rL305163: Fix unused variable warning on non-debug EXPENSIVE_CHECKS builds.
Fix unused variable warning on non-debug EXPENSIVE_CHECKS builds
Sun, Jun 11, 5:50 AM
RKSimon created D34089: [llvm-stress] Ensure that the C++11 random device respects its min/max values (PR32585).
Sun, Jun 11, 5:45 AM
RKSimon created D34087: [X86] EltsFromConsecutiveLoads - detect split loads without a common load base (PR32940).
Sun, Jun 11, 3:54 AM

Sat, Jun 10

RKSimon committed rL305154: [X86][SSE] Extended PR32368 to SSE/AVX1/AVX2.
[X86][SSE] Extended PR32368 to SSE/AVX1/AVX2
Sat, Jun 10, 2:13 PM
RKSimon committed rL305153: [X86][AVX512] Added test case for PR32368.
[X86][AVX512] Added test case for PR32368
Sat, Jun 10, 1:59 PM
RKSimon updated the diff for D19325: DAGCombine: (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2).

Rebasing - this is part of constant canonicalizations to try and combine MUL/SHL ops separated by ADD/OR ops.

Sat, Jun 10, 1:24 PM
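The rewrite named in the D19325 title, (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2), can be checked with a small model. This is a plain Python sketch over 32-bit values, not DAGCombiner code; the width and the function names are assumptions for illustration.

```python
WIDTH = 32                    # assumed integer width
MASK = (1 << WIDTH) - 1


def shl_of_or(x: int, c1: int, c2: int) -> int:
    """Original form: (x | c1) << c2, truncated to WIDTH bits."""
    return ((x | c1) << c2) & MASK


def or_of_shl(x: int, c1: int, c2: int) -> int:
    """Rewritten form: (x << c2) | (c1 << c2), truncated to WIDTH bits."""
    folded_constant = (c1 << c2) & MASK   # both operands constant, so foldable
    return ((x << c2) & MASK) | folded_constant
```

After the rewrite the shift applies directly to `x`, and `c1 << c2` is a single constant, which is what lets a shift or multiply on `x` meet other shift or multiply operations that the OR previously separated.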
RKSimon commandeered D19325: DAGCombine: (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2).
Sat, Jun 10, 1:18 PM
RKSimon added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

I've added the SLP vectorization tests at rL305151

Sat, Jun 10, 12:17 PM
RKSimon committed rL305151: [X86][SLM] Add SLM arithmetic vectorization tests.
[X86][SLM] Add SLM arithmetic vectorization tests
Sat, Jun 10, 12:16 PM
RKSimon accepted D33938: [x86] use vperm2f128 rather than vinsertf128 when there's a chance to fold a 32-byte load.

Looking at lowerV2X128VectorShuffle, shuffle combining will have a much easier time if we keep to 256-bit vectors (blends / X86ISD::VPERM2X128) as much as possible - subvector extract/insert chains make combining really tricky - and dealing with the memory cases looks like a good first step.

Sat, Jun 10, 9:39 AM
RKSimon added a comment to D34069: [DAGCombiner] Fix PR33368 (vector extend/truncate optimization).

If possible I'd like to see a regular llc test as well as the mir test.

Sat, Jun 10, 6:12 AM
RKSimon added a comment to D34077: DAGCombine: Combine BUILD_VECTOR to TRUNCATE.

A few minors, but would like to see ARM fixed first.

Sat, Jun 10, 6:10 AM

Fri, Jun 9

RKSimon committed rL305091: [X86][SSE] Add support for PACKSS nodes to faux shuffle extraction.
[X86][SSE] Add support for PACKSS nodes to faux shuffle extraction
Fri, Jun 9, 10:30 AM
RKSimon updated subscribers of D34056: Tail merge size.
Fri, Jun 9, 9:23 AM
RKSimon added a comment to D33902: [X86][SSE] Change memop fragment to inherit from vec128load with local alignment controls.

ping?

Fri, Jun 9, 8:19 AM

Thu, Jun 8

RKSimon committed rL304988: Wdocumentation fix..
Wdocumentation fix.
Thu, Jun 8, 10:01 AM
RKSimon accepted D33203: Add scheduler classes to integer/float horizontal operations.

LGTM

Thu, Jun 8, 4:06 AM
RKSimon committed rL304973: Regenerate test.
Regenerate test
Thu, Jun 8, 3:25 AM

Wed, Jun 7

RKSimon added a comment to D29402: [SLP] Initial rework for min/max horizontal reduction vectorization, NFC..

A couple of NFC changes that will reduce this patch.

Wed, Jun 7, 9:59 AM
RKSimon added inline comments to D33994: [DAGCombiner] Add another combine from build vector to shuffle.
Wed, Jun 7, 9:52 AM
RKSimon edited reviewers for D33994: [DAGCombiner] Add another combine from build vector to shuffle, added: efriedma; removed: eli.friedman.
Wed, Jun 7, 9:48 AM
RKSimon added a comment to D29402: [SLP] Initial rework for min/max horizontal reduction vectorization, NFC..

@ABataev You marked a lot of the comments done but haven't updated the diff.

Wed, Jun 7, 7:50 AM
RKSimon committed rL304911: [DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering..
[DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering.
Wed, Jun 7, 7:05 AM
RKSimon closed D33882: [DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering. by committing rL304911: [DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering..
Wed, Jun 7, 7:05 AM
RKSimon added inline comments to D33099: AMD Jaguar scheduler doesn't correctly model 256-bit AVX instructions.
Wed, Jun 7, 7:04 AM
RKSimon abandoned D32219: [X86][SSE] Improve DIV/SQRT throughput estimates for SB/HW schedule models.

D33897 is moving SNB/HW scheduler table to auto-gen

Wed, Jun 7, 6:53 AM
RKSimon added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

You might want to add a slm target to the llvm\test\Transforms\SLPVectorizer\X86\arith-add.ll, arith-mul.ll and arith-sub.ll tests as well.

Do you mean splitting the slm-arith-costs.ll test into those files, or adding extra test cases by also checking those tests on SLM?

Wed, Jun 7, 6:31 AM
RKSimon accepted D33862: [x86] avoid flipping sign bits for vector icmp by using known bits.

LGTM

Wed, Jun 7, 6:03 AM
RKSimon added a comment to D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

You might want to add a slm target to the llvm\test\Transforms\SLPVectorizer\X86\arith-add.ll, arith-mul.ll and arith-sub.ll tests as well.

Wed, Jun 7, 5:55 AM
RKSimon updated subscribers of D33983: update add\sub costs of vectors of 64 in X86\SLM arch.

Please don't forget to include llvm-commits as a subscriber to all llvm patches

Wed, Jun 7, 5:51 AM
RKSimon added a comment to D33203: Add scheduler classes to integer/float horizontal operations.

Almost there I think - couple of minors

Wed, Jun 7, 5:49 AM
RKSimon committed rL304894: [X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining.
[X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining
Wed, Jun 7, 3:31 AM