craig.topper (Craig Topper)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 30 2013, 7:58 PM (233 w, 1 d)

Recent Activity

Today

craig.topper added a comment to D42088: [x86] shrink 'and' immediate values by setting the high bits (PR35907).

I removed the srl/and reversing transform from X86 to see if we did any better, but we end up with an and with 65024 in front of the shift.

Thu, Jan 18, 12:15 AM
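
To illustrate the D42088 comment above: 65024 is 0xFE00, so masking before a right shift needs a wide immediate, while masking after the shift needs only a small one. The sketch below is an illustration only; the shift amount of 9 and all function names are assumptions, not taken from the patch.

```python
# Illustration only: the shift amount (9) is assumed because 65024 == 0xFE00
# covers bits 9..15. Function names are hypothetical.

def and_then_shift(x):
    # Mask first (immediate 0xFE00 == 65024), then shift right.
    return (x & 0xFE00) >> 9

def shift_then_and(x):
    # Shift first, then mask with the much smaller immediate 0x7F.
    return (x >> 9) & 0x7F

def fits_simm8(value, width=32):
    # True if the value, read as a signed integer of `width` bits,
    # is representable as a sign-extended 8-bit immediate.
    signed = value - (1 << width) if value & (1 << (width - 1)) else value
    return -128 <= signed <= 127
```

Both forms compute the same result, but only the 0x7F mask fits a sign-extended 8-bit immediate, which is why the position of the and relative to the shift affects encoding size.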

Yesterday

craig.topper added a comment to D42205: [X86] Add intrinsic support for the RDPID instruction.

I think we already have the encoding tests for the assembler and disassembler. The instructions were already present before this patch.

Wed, Jan 17, 11:05 PM
craig.topper added inline comments to D41330: [X86] Reduce Store Forward Block issues in HW.
Wed, Jan 17, 5:21 PM
craig.topper added a comment to D42128: [X86] Extend load-op-store fusion merge to ADC/SBB..

Unless I'm crazy the entirety of addcarry2.ll already passes on trunk. I think we hit regular isel patterns unless the carry out of the ADC/SBB is used.

Wed, Jan 17, 4:38 PM
craig.topper added inline comments to D42128: [X86] Extend load-op-store fusion merge to ADC/SBB..
Wed, Jan 17, 4:34 PM
craig.topper created D42205: [X86] Add intrinsic support for the RDPID instruction.
Wed, Jan 17, 3:25 PM
craig.topper added a comment to D41879: [X86] Added support for nocf_check attribute for indirect Branch Tracking.

You can have two instructions use the same encoding if you mark one with isAsmParserOnly = 1. See XACQUIRE_PREFIX and XRELEASE_PREFIX. isAsmParserOnly will hide the instruction from the disassembler, which checks for encoding collisions.

Wed, Jan 17, 2:55 PM
craig.topper updated the diff for D41895: [X86] Another attempt at support prefer-vector-width function attribute.

Fix context

Wed, Jan 17, 2:10 PM
craig.topper updated the diff for D41895: [X86] Another attempt at support prefer-vector-width function attribute.

Rebase to fix the outdated trunc code. Add helper for splitting and extending v16i1 to v16i16/v16i8

Wed, Jan 17, 2:08 PM
craig.topper added inline comments to D41895: [X86] Another attempt at support prefer-vector-width function attribute.
Wed, Jan 17, 1:47 PM
craig.topper updated the diff for D42090: [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector.

Address review comments.

Wed, Jan 17, 1:16 PM
craig.topper added inline comments to D42090: [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector.
Wed, Jan 17, 11:35 AM
craig.topper closed D42153: [X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to introduce bitcasts to i64 in 32-bit mode.

Fixed in r322724, but forgot to tag it in the commit.

Wed, Jan 17, 10:50 AM
craig.topper added 1 commit(s) for D42153: [X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to introduce bitcasts to i64 in 32-bit mode: rL322724: [X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to….
Wed, Jan 17, 10:50 AM
craig.topper added an edge to rL322724: [X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to…: D42153: [X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to introduce bitcasts to i64 in 32-bit mode.
Wed, Jan 17, 10:50 AM
craig.topper added a comment to D41599: [X86] Lowering X86 avx512 sqrt intrinsics to IR - LLVM.

Looking at clang's CGBuiltin.cpp we do have precedent for using Intrinsic::sqrt for builtins for AArch64, PowerPC, and SystemZ.

Wed, Jan 17, 10:21 AM
craig.topper added a comment to D41895: [X86] Another attempt at support prefer-vector-width function attribute.

Ping

Wed, Jan 17, 10:01 AM
craig.topper added a reviewer for D41895: [X86] Another attempt at support prefer-vector-width function attribute: spatel.
Wed, Jan 17, 10:00 AM
craig.topper updated the diff for D42031: [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8..

Add diffs for more v32i1 shuffle test cases. I haven't committed the current versions of the tests to the repo yet, but you can see the before and after here.

Wed, Jan 17, 12:55 AM

Tue, Jan 16

craig.topper added a comment to D42031: [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8..

While the v32i1 is technically a regression, it's not a fair test of what happens with v32i1 shuffles. Previously we were getting lucky with type promotion working favorably with the types used for argument passing. I think a shuffle sandwiched between say an icmp and a select condition would be very different.

Tue, Jan 16, 11:53 PM
craig.topper planned changes to D42091: [X86] Move i1 shuffle legalizing from lowering to DAG combine.
Tue, Jan 16, 11:26 PM
craig.topper added a reviewer for D41879: [X86] Added support for nocf_check attribute for indirect Branch Tracking: RKSimon.
Tue, Jan 16, 9:47 PM
craig.topper added a comment to D41879: [X86] Added support for nocf_check attribute for indirect Branch Tracking.

Should we be printing the DS_PREFIX as "notrack" like gcc?

Tue, Jan 16, 9:46 PM
craig.topper updated the diff for D42090: [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector.

With context this time.

Tue, Jan 16, 8:41 PM
craig.topper updated the diff for D42086: [X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit elements and use a 64-bit broadcast.

Address Sanjay's comments.

Tue, Jan 16, 8:34 PM
craig.topper added inline comments to D42154: Don't generate inline atomics for i386/i486.
Tue, Jan 16, 6:55 PM
craig.topper added inline comments to D42154: Don't generate inline atomics for i386/i486.
Tue, Jan 16, 6:28 PM
craig.topper added inline comments to D42154: Don't generate inline atomics for i386/i486.
Tue, Jan 16, 6:23 PM
craig.topper created D42153: [X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to introduce bitcasts to i64 in 32-bit mode.
Tue, Jan 16, 5:28 PM
craig.topper added inline comments to D42123: Derive GEP index type from Data Layout.
Tue, Jan 16, 5:16 PM
craig.topper added inline comments to D42128: [X86] Extend load-op-store fusion merge to ADC/SBB..
Tue, Jan 16, 3:01 PM
craig.topper added a comment to D41233: [InstCombine] Canonizing 'and' before 'shl'.

I know smaller constants are meaningful to X86. But do other targets have different immediate sizes for And instructions? ARM encodes immediates with a shift amount applied to them, I think? Not sure about others.

Tue, Jan 16, 2:32 PM
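
As a rough sketch of why "smaller constant" is target-specific: x86 can encode an and immediate as a sign-extended 8-bit value, while the classic 32-bit ARM (A32) data-processing encoding allows an 8-bit value rotated right by an even amount. The helper names below are hypothetical, and Thumb-2 and AArch64 use different rules that are not modeled here.

```python
# Illustration only; helper names are hypothetical.

def rotl32(v, n):
    # Rotate a 32-bit value left by n bits.
    n %= 32
    return ((v << n) | (v >> (32 - n))) & 0xFFFFFFFF

def is_arm_a32_immediate(v):
    # A32 data-processing immediates: an 8-bit value rotated right by an
    # even amount. Undo every even rotation and check for an 8-bit result.
    v &= 0xFFFFFFFF
    return any(rotl32(v, rot) < 256 for rot in range(0, 32, 2))

def fits_x86_simm8(v):
    # x86 short form: the 32-bit value must be a sign-extended 8-bit value.
    v &= 0xFFFFFFFF
    signed = v - (1 << 32) if v & 0x80000000 else v
    return -128 <= signed <= 127
```

For example, 0xFE00 is encodable under the A32 rule but not as an x86 sign-extended imm8, while 0xFFFFFFF0 is the opposite case, so a canonical form that favors one target's immediates may not help another.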
craig.topper added inline comments to D42088: [x86] shrink 'and' immediate values by setting the high bits (PR35907).
Tue, Jan 16, 2:19 PM
craig.topper added inline comments to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.
Tue, Jan 16, 2:00 PM
craig.topper added a comment to D42088: [x86] shrink 'and' immediate values by setting the high bits (PR35907).

I wonder if we should do this in PreprocessISelDAG so the killing off of the AND when the mask is all 1s doesn't seem quite so weird. Right now when it happens the Select call ends up doing selection on the input to the AND.

Tue, Jan 16, 11:53 AM
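
A small sketch of the idea discussed in the comment above, under assumed inputs: if some high bits of the and operand are already known to be zero, those bits can be set in the mask without changing the result. The widened mask may then fit a sign-extended 8-bit immediate, or become all 1s, in which case the and does nothing. The names and the example masks are hypothetical, not taken from the patch.

```python
# Illustration only; names and example values are hypothetical.

WIDTH = 32
ALL_ONES = (1 << WIDTH) - 1

def widen_mask(mask, known_zero):
    # Bits known to be zero in the other operand can be set in the mask
    # freely: the and result is unchanged.
    return (mask | known_zero) & ALL_ONES

def to_signed(value):
    # Reinterpret a 32-bit unsigned value as signed.
    return value - (1 << WIDTH) if value & (1 << (WIDTH - 1)) else value

def and_is_redundant(mask, known_zero):
    # An and with an all-ones mask can be dropped entirely.
    return widen_mask(mask, known_zero) == ALL_ONES
```

With an operand produced by a logical right shift by 4, the top four bits are zero, so the mask 0x0FFFFFF0 can be widened to 0xFFFFFFF0, which is -16 as a signed value. With a shift by 8 and the mask 0x00FFFFFF, the widened mask is all 1s and the and can be removed.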
craig.topper added inline comments to D42088: [x86] shrink 'and' immediate values by setting the high bits (PR35907).
Tue, Jan 16, 10:17 AM

Mon, Jan 15

craig.topper updated the diff for D42031: [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8..

Rebase

Mon, Jan 15, 7:22 PM
craig.topper added a dependency for D42091: [X86] Move i1 shuffle legalizing from lowering to DAG combine: D42090: [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector.
Mon, Jan 15, 6:32 PM
craig.topper added a dependent revision for D42090: [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector: D42091: [X86] Move i1 shuffle legalizing from lowering to DAG combine.
Mon, Jan 15, 6:32 PM
craig.topper created D42091: [X86] Move i1 shuffle legalizing from lowering to DAG combine.
Mon, Jan 15, 6:31 PM
craig.topper added a dependency for D42090: [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector: D42086: [X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit elements and use a 64-bit broadcast.
Mon, Jan 15, 5:29 PM
craig.topper added a dependent revision for D42086: [X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit elements and use a 64-bit broadcast: D42090: [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector.
Mon, Jan 15, 5:29 PM
craig.topper created D42090: [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat elemnt is a bitcast from a vector type into a concat_vector.
Mon, Jan 15, 5:26 PM
craig.topper created D42086: [X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit elements and use a 64-bit broadcast.
Mon, Jan 15, 1:58 PM
craig.topper accepted D40879: [X86][I86,I186,I286,I386,I486,PPRO, MMX]: Adding full coverage of MC encoding for the I86, I186, I286, I386, I486, PPRO and MMX isa sets.<NFC>.

LGTM

Mon, Jan 15, 11:43 AM
craig.topper accepted D41282: [X86][XSAVE]: Adding full coverage of MC encoding for the XSAVE isa sets.<NFC>.

LGTM

Mon, Jan 15, 11:43 AM
craig.topper accepted D41523: xmmintrin.h documentation fixes and updates.

LGTM

Mon, Jan 15, 11:39 AM
craig.topper accepted D41908: [X86][MMX] Add support for MMX zero vector creation.

LGTM

Mon, Jan 15, 11:36 AM
craig.topper accepted D42042: [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873).

LGTM

Mon, Jan 15, 11:32 AM

Sun, Jan 14

craig.topper added a comment to D41794: [X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation.

Yeah there may be some crossover with lowerVectorShuffleByMerging128BitLanes. I'll see if I can generalize lowerVectorShuffleByMerging128BitLanes more.

Sun, Jan 14, 4:10 PM
craig.topper added inline comments to D42042: [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873).
Sun, Jan 14, 1:14 PM
craig.topper updated the diff for D41794: [X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation.

Actually rebase the patch.

Sun, Jan 14, 1:00 PM
craig.topper updated the diff for D41794: [X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation.

Rebase now that tests have been committed.

Sun, Jan 14, 12:52 PM
craig.topper added inline comments to D42042: [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873).
Sun, Jan 14, 12:42 PM
craig.topper added inline comments to D42042: [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873).
Sun, Jan 14, 12:06 PM
craig.topper added inline comments to D42042: [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873).
Sun, Jan 14, 11:44 AM
craig.topper added inline comments to D42042: [X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873).
Sun, Jan 14, 11:13 AM

Sat, Jan 13

craig.topper updated the diff for D42031: [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8..

Rebase after improving vXi16/vXi8 select combines

Sat, Jan 13, 6:19 PM
craig.topper added reviewers for D42031: [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8.: zvi, RKSimon, delena, spatel.
Sat, Jan 13, 11:53 AM
craig.topper created D42031: [X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the default of promoting to v32i8..
Sat, Jan 13, 11:52 AM
craig.topper updated the diff for D42018: [X86] Autoupgrade kunpck intrinsics using vector operations instead of scalar operations.

Update with context

Sat, Jan 13, 11:06 AM

Fri, Jan 12

craig.topper updated subscribers of D42016: [X86] Implement old kunpck intrinsics using vector ops on vXi1 instead of integer shift/and/or.
Fri, Jan 12, 4:18 PM
craig.topper created D42018: [X86] Autoupgrade kunpck intrinsics using vector operations instead of scalar operations.
Fri, Jan 12, 4:17 PM
craig.topper created D42016: [X86] Implement old kunpck intrinsics using vector ops on vXi1 instead of integer shift/and/or.
Fri, Jan 12, 4:02 PM
craig.topper updated the diff for D37418: [X86] Use btc/btr/bts to implement xor/and/or that affects a single bit in the upper 32-bits of a 64-bit operation..

Remove the X86TargetTransformInfo.cpp changes and rebase.

Fri, Jan 12, 11:37 AM
craig.topper added a comment to D41908: [X86][MMX] Add support for MMX zero vector creation.

This seems to cause a constant pool to be used for the second _mm_setzero_si64 call.

Fri, Jan 12, 10:46 AM
craig.topper resigned from D35688: More extendable LaneBitmask.
Fri, Jan 12, 9:42 AM
craig.topper updated the diff for D41282: [X86][XSAVE]: Adding full coverage of MC encoding for the XSAVE isa sets.<NFC>.

Remove 64-bit instructions from 32-bit mode tests

Fri, Jan 12, 9:24 AM
craig.topper commandeered D41282: [X86][XSAVE]: Adding full coverage of MC encoding for the XSAVE isa sets.<NFC>.

Commandeering so I can remove the 64-bit instructions from the 32-bit tests.

Fri, Jan 12, 9:23 AM
craig.topper added a comment to D41983: [X86] Fix missing predicates HasAVX512 Predicates in avx512_sqrt_scalar..

I'm not sure there's a functional change here. The instructions that are being modified don't have patterns defined.

Fri, Jan 12, 9:16 AM
craig.topper updated the diff for D40879: [X86][I86,I186,I286,I386,I486,PPRO, MMX]: Adding full coverage of MC encoding for the I86, I186, I286, I386, I486, PPRO and MMX isa sets.<NFC>.

Remove 64-bit mode only instructions from the 32-bit tests. Use movq mnemonic instead of movd for moves between 64-bit GPRs and MMX registers.

Fri, Jan 12, 9:08 AM
craig.topper added a comment to D41962: Fix TestYMMRegisters for older machines without AVX2.

__builtin_cpu_init was added to clang between 5.0 and 6.0

Fri, Jan 12, 9:08 AM

Thu, Jan 11

craig.topper commandeered D40879: [X86][I86,I186,I286,I386,I486,PPRO, MMX]: Adding full coverage of MC encoding for the I86, I186, I286, I386, I486, PPRO and MMX isa sets.<NFC>.

Commandeering so I can rebase this patch

Thu, Jan 11, 8:17 PM
craig.topper accepted D41172: [X86][AVX512F_512]: Adding full coverage of MC encoding for the AVX512F 512 bits isa sets.<NFC>.

LGTM

Thu, Jan 11, 6:02 PM
craig.topper added a comment to D41173: [X86][AVX512F_256]: Adding full coverage of MC encoding for the AVX512F 256 bits isa sets.<NFC>.

There are some VEX encoded instructions in the 64-bit test file. It probably can't be helped in the 32-bit file since you can't use xmm16-31.

Thu, Jan 11, 5:59 PM
craig.topper added inline comments to D40776: [X86][AVX512]: Adding full coverage of MC encoding for the AVX512 isa sets (w/o AVX512F).<NFC>.
Thu, Jan 11, 5:52 PM
craig.topper added a comment to D41962: Fix TestYMMRegisters for older machines without AVX2.

I don't know what platforms this needs to support. But __builtin_cpu_supports only works when compiled with clang or gcc. And it requires compiler-rt or libgcc. I don't know if that's guaranteed to exist on Windows.

Thu, Jan 11, 1:37 PM
craig.topper added inline comments to D41879: [X86] Added support for nocf_check attribute for indirect Branch Tracking.
Thu, Jan 11, 10:09 AM

Wed, Jan 10

craig.topper accepted D41610: [X86] Implementation of X86Operand::print.

LGTM

Wed, Jan 10, 11:30 PM
craig.topper added a comment to D41794: [X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation.

Can you show what code you expect for those cases and I'll see what I can do?

Wed, Jan 10, 7:37 PM
craig.topper updated the diff for D41895: [X86] Another attempt at support prefer-vector-width function attribute.

Add canExtendTo512DQ and canExtendTo512BW helper methods to encapsulate hasAVX512/hasBWI and prefer vector width check into one place.

Wed, Jan 10, 12:51 PM
craig.topper added inline comments to D40330: Separate ExecutionDepsFix into 4 parts - enable breaking false dependencies for all reg classes..
Wed, Jan 10, 11:54 AM
craig.topper updated the diff for D40055: [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes.

Fix the calls in DAG combine and add test cases for it.

Wed, Jan 10, 10:17 AM
craig.topper added inline comments to D41908: [X86][MMX] Add support for MMX zero vector creation.
Wed, Jan 10, 9:41 AM
craig.topper updated the diff for D40055: [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes.

Remove unneeded line that snuck in during rebase.

Wed, Jan 10, 9:28 AM

Tue, Jan 9

craig.topper created D41895: [X86] Another attempt at support prefer-vector-width function attribute.
Tue, Jan 9, 6:02 PM
craig.topper abandoned D41096: [X86] Initial support for prefer-vector-width function attribute.
Tue, Jan 9, 1:41 PM
craig.topper updated the diff for D40055: [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes.

Rebase and add the asserts Simon requested

Tue, Jan 9, 1:22 PM
craig.topper added inline comments to D41610: [X86] Implementation of X86Operand::print.
Tue, Jan 9, 10:42 AM
craig.topper resigned from D41863: [AArch64] Fix incorrect LD1 of 16-bit FP vectors in big endian.
Tue, Jan 9, 9:54 AM

Mon, Jan 8

craig.topper updated the diff for D41850: [X86] Add a DAG combine to combine (sext (setcc)) with VLX.

Also block SETNE since that requires an XOR with all 1s to implement.

Mon, Jan 8, 10:40 PM
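
A lane-wise model of why SETNE is the odd one out in the D41850 update above, assuming the combine targets the legacy vector compares: those compares produce all-ones lanes for "equal" (PCMPEQ) and "greater than" (PCMPGT), but there is no "not equal" form, so a sign-extended SETNE has to be built as the equality result XORed with all 1s. The function names are hypothetical and the 32-bit lane width is chosen only for illustration.

```python
# Illustration only; function names are hypothetical, lanes are 32 bits wide.

ALL_ONES = 0xFFFFFFFF

def pcmpeq_lanes(a, b):
    # Models a vector equality compare: all-ones where lanes match, else 0.
    # This is also what (sext (setcc eq)) produces per lane.
    return [ALL_ONES if x == y else 0 for x, y in zip(a, b)]

def sext_setne_lanes(a, b):
    # "Not equal" costs one extra operation: invert the equality result
    # by XORing every lane with all 1s.
    return [lane ^ ALL_ONES for lane in pcmpeq_lanes(a, b)]
```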
craig.topper created D41850: [X86] Add a DAG combine to combine (sext (setcc)) with VLX.
Mon, Jan 8, 10:31 PM
craig.topper added inline comments to D41517: mmintrin.h documentation fixes and updates.
Mon, Jan 8, 4:30 PM
craig.topper added a comment to D40055: [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes.

@hfinkel do you have any opinions on this?

Mon, Jan 8, 4:05 PM
craig.topper added a comment to D41833: [lli] Make lli support -mcpu=native for CPU autodetection.

Do you have any suggestions of how to do that? Not sure what features I could guarantee from the build bots.

Mon, Jan 8, 12:57 PM
craig.topper created D41833: [lli] Make lli support -mcpu=native for CPU autodetection.
Mon, Jan 8, 12:37 PM
craig.topper retitled D41341: [X86] Disable 512-bit vectors during type legalization for prefer-vector-width from [X86] WIP disable 512-bit vectors during type legalization for prefer-vector-width to [X86] Disable 512-bit vectors during type legalization for prefer-vector-width.
Mon, Jan 8, 10:24 AM
craig.topper added inline comments to D41811: X86: Add pattern matching for PMADDWD.
Mon, Jan 8, 9:38 AM
craig.topper added a comment to D38313: [InstCombine] Introducing Aggressive Instruction Combine pass.

I think the patch looks good now with the vector fix. Did you hear anything from @mzolotukhin about compile time?

Mon, Jan 8, 9:26 AM
craig.topper updated the diff for D41062: [X86] Legalize v2i32 via widening rather than promoting.

Rebase on top of other recent changes.

Mon, Jan 8, 12:46 AM