- User Since
- May 5 2014, 7:26 AM (163 w, 4 d)
D34472 is providing a more general solution
Some minor comments.
Add goldmont to llvm\test\CodeGen\X86\cpus.ll
- test/CodeGen/X86/clear_upper_vector_element_bits.ll - DAG is improved, but scheduling causes a code size degradation. This is probably okay.
Wed, Jun 21
@t.p.northover Are you happy for the AArch64 codegen regression to be raised as a bug?
LGTM. Please do the peekThroughBitcast helper as a NFC pre-commit.
While I don't want to end up forcing you to over-engineer this, this pass currently seems very specific. It should be generalised so that it works easily for constant arrays of other patterns that are cheap to rematerialize (upper/lower masks, single bit masks, etc.) - a "SubstituteLoadWithRematerializationPass", possibly driven by the TTI cost models?
Does anyone have any concerns about this? Otherwise I'd like to accept it.
LGTM, with the addition of some ctpop vector tests, even if they don't currently combine.
LGTM with one minor about the comment in the lv test
Tue, Jun 20
Mon, Jun 19
Any chance of a test case?
Thu, Jun 15
Some minor observations. It'd be great if we could reduce the size of this patch somehow.
@aymanmus I'm sorry but I had to revert this at rL305470 as it's causing tablegen crashes on windows buildbots: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/10612
Updated as requested by @chandlerc - my only concern is that I have no idea whether uniform_int_distribution guarantees the same behaviour on different targets as the Mersenne Twister engine does.
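For context on that concern, here is a minimal sketch, not taken from the patch. The C++ standard fixes the output sequence of std::mt19937, but it leaves the algorithm used by std::uniform_int_distribution to the implementation, so the same seed may give different bounded values on different standard libraries. The helper name portableBounded is hypothetical; it maps raw engine output into a range with a plain modulo, which is slightly biased but gives the same result on every target.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper (not from the patch under review): returns a value in
// [0, max] derived directly from the raw engine output. The modulo introduces
// a small bias, but the result depends only on std::mt19937, whose sequence
// is fixed by the standard, so it is the same on every target.
std::uint32_t portableBounded(std::mt19937 &engine, std::uint32_t max) {
  // Widen before adding 1 so that max == UINT32_MAX does not overflow.
  const std::uint64_t range = static_cast<std::uint64_t>(max) + 1;
  return static_cast<std::uint32_t>(engine() % range);
}
```

Calling portableBounded(engine, 9) on an engine seeded with a fixed value yields the same digit sequence everywhere, which is what a test with hard-coded expectations would need.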
Wed, Jun 14
Tue, Jun 13
Added assertion that the random value doesn't exceed the maximum.
Mon, Jun 12
A few minors, but I'd prefer someone more knowledgeable in this area to review it.
Sun, Jun 11
Sat, Jun 10
Rebasing - this is part of the constant canonicalizations to try and combine MUL/SHL ops separated by ADD/OR ops.
I've added the SLP vectorization tests at rL305151
Looking at lowerV2X128VectorShuffle, shuffle combining will have a much easier time if we keep to 256-bit vectors (blends / X86ISD::VPERM2X128) as much as possible - subvector extract/insert chains make combining really tricky - and dealing with the memory cases looks like a good first step.
If possible I'd like to see a regular llc test as well as the mir test.
A few minors, but would like to see ARM fixed first.
Fri, Jun 9
Thu, Jun 8
Wed, Jun 7
A couple of NFC changes that will reduce this patch.
@ABataev You marked a lot of the comments done but haven't updated the diff.
D33897 is moving SNB/HW scheduler table to auto-gen
You might want to add a slm target to the llvm\test\Transforms\SLPVectorizer\X86\arith-add.ll, arith-mul.ll and arith-sub.ll tests as well.
Please don't forget to include llvm-commits as a subscriber to all llvm patches
Almost there I think - couple of minors