craig.topper (Craig Topper)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 30 2013, 7:58 PM (268 w, 2 d)

Recent Activity

Tue, Sep 18

craig.topper added a comment to D52070: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

@spatel, @dmgreen I’m away from a computer for a couple more days, if one of you wants to land this in the meantime.

Tue, Sep 18, 11:25 AM

Sun, Sep 16

craig.topper added inline comments to D35035: [InstCombine] Prevent memcpy generation for small data size.
Sun, Sep 16, 6:02 PM
craig.topper added a comment to D48491: [X86] Select BEXTR when there is only BMI1..

Can we DAG combine to X86ISD::BEXTR for these cases? Then we can properly check one use of the inner parts of the pattern. The other option might be to use PatFrags that check for one use like the existing srl_su fragment. My concern is that if the inner nodes of the patterns here have more uses, the emitted pattern won't completely remove the need for the inner node. So you'll end up emitting multiple instructions for this pattern and additional instructions for the remaining use of the inner node.

Sun, Sep 16, 5:56 PM

Fri, Sep 14

craig.topper created D52134: [X86] Remove an fp->int->fp domain crossing in LowerUINT_TO_FP_i64..
Fri, Sep 14, 11:22 PM
craig.topper updated the diff for D52075: [InstCombine] Support (sub (sext x), (sext y)) --> (sext (sub x, y)) and (sub (zext x), (zext y)) --> (zext (sub x, y)).

Rebase to use the code from r342292.

Fri, Sep 14, 10:08 PM
craig.topper updated the diff for D52121: [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)).

Punt on vXi8 types for now. Add FIXMEs.

Fri, Sep 14, 3:37 PM
craig.topper added a comment to D52121: [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)).

Not sure how D38128 would help. Isn't that about register allocation?

Fri, Sep 14, 3:16 PM
craig.topper added inline comments to D52121: [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)).
Fri, Sep 14, 3:12 PM
craig.topper updated the diff for D52121: [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)).

Or I failed to get the file in the diff at all.

Fri, Sep 14, 2:00 PM
craig.topper updated the diff for D52121: [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)).

Use &&

Fri, Sep 14, 1:47 PM
craig.topper added inline comments to D52109: [TwoAddressInstructionPass] Don't update SrcRegMap for copies inserted for tied register constraint when the src isn't killed.
Fri, Sep 14, 1:45 PM
craig.topper created D52121: [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)).
Fri, Sep 14, 1:19 PM
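
A rough illustration of why the fold in D52121 can work, written as a Python model rather than anything taken from the patch. The helper names (`movmsk`, `original_pattern`, `folded_pattern`, `patterns_agree`) are hypothetical. One assumption to flag loudly: the sketch shifts by `bits - 1 - C`, because that is the amount that moves bit C into the sign position that movmsk reads; the `X << C` in the title is treated here as shorthand for that.

```python
def movmsk(lanes, bits):
    """Pack the sign bit of each lane into an integer; lane 0 becomes bit 0."""
    mask = 0
    for i, lane in enumerate(lanes):
        mask |= ((lane >> (bits - 1)) & 1) << i
    return mask


def original_pattern(lanes, c, bits):
    """movmsk(setne(and(X, 1 << C), 0)): a true compare yields an all-ones lane."""
    all_ones = (1 << bits) - 1
    compared = [all_ones if (lane & (1 << c)) != 0 else 0 for lane in lanes]
    return movmsk(compared, bits)


def folded_pattern(lanes, c, bits):
    """movmsk of X shifted left so that bit C lands in the sign bit (assumed shift amount)."""
    lane_mask = (1 << bits) - 1
    shifted = [(lane << (bits - 1 - c)) & lane_mask for lane in lanes]
    return movmsk(shifted, bits)


def patterns_agree(bits=4):
    """Exhaustively compare both forms on two-lane vectors of small lanes."""
    for c in range(bits):
        for a in range(1 << bits):
            for b in range(1 << bits):
                lanes = [a, b]
                if original_pattern(lanes, c, bits) != folded_pattern(lanes, c, bits):
                    return False
    return True
```

Both forms only depend on bit C of each lane, so the compare against zero can be replaced by a shift that exposes that bit directly to movmsk.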
craig.topper updated the summary of D52109: [TwoAddressInstructionPass] Don't update SrcRegMap for copies inserted for tied register constraint when the src isn't killed.
Fri, Sep 14, 11:25 AM
craig.topper updated subscribers of D52109: [TwoAddressInstructionPass] Don't update SrcRegMap for copies inserted for tied register constraint when the src isn't killed.
Fri, Sep 14, 11:21 AM
craig.topper created D52109: [TwoAddressInstructionPass] Don't update SrcRegMap for copies inserted for tied register constraint when the src isn't killed.
Fri, Sep 14, 11:21 AM
craig.topper accepted D52043: [X86][SSE] Lower shuffles to permute(unpack(x,y)) (PR31151).

LGTM

Fri, Sep 14, 10:18 AM
craig.topper created D52075: [InstCombine] Support (sub (sext x), (sext y)) --> (sext (sub x, y)) and (sub (zext x), (zext y)) --> (zext (sub x, y)).
Fri, Sep 14, 12:01 AM
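
As a small arithmetic sketch of the sext half of D52075 (not code from the patch), the Python below models i8 values widened to i16. It checks that subtracting after sign extension matches sign-extending the narrow subtraction exactly when the narrow subtraction does not overflow. Whether and how the patch tests for overflow is not stated here, so that precondition is an assumption of this sketch; all function names are hypothetical.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def sub_of_sext(x, y, wide=16):
    """(sub (sext x), (sext y)): sext keeps the numeric value, so subtract in the wide type."""
    return wrap_signed(x - y, wide)


def sext_of_sub(x, y, narrow=8):
    """(sext (sub x, y)): subtract with wraparound in the narrow type, then extend."""
    return wrap_signed(x - y, narrow)


def narrow_sub_overflows(x, y, narrow=8):
    """True if x - y does not fit in the narrow signed type."""
    return not (-(1 << (narrow - 1)) <= x - y < (1 << (narrow - 1)))


def fold_is_exact_iff_no_overflow(narrow=8, wide=16):
    """Check every pair of narrow signed values."""
    values = range(-(1 << (narrow - 1)), 1 << (narrow - 1))
    return all(
        (sub_of_sext(x, y, wide) == sext_of_sub(x, y, narrow))
        == (not narrow_sub_overflows(x, y, narrow))
        for x in values
        for y in values
    )
```

For example, with x = -128 and y = 1 the wide subtraction gives -129 while the narrow one wraps to 127, so the two forms differ only in the overflowing case.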

Thu, Sep 13

craig.topper created D52070: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.
Thu, Sep 13, 9:42 PM
craig.topper created D52063: [X86] Fix inline expansion for memset in x32.
Thu, Sep 13, 3:06 PM
craig.topper accepted D51829: [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand..

I'm going to assume Reid meant to hit approve instead of changes required.

Thu, Sep 13, 1:44 PM

Wed, Sep 12

craig.topper updated the diff for D51964: [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible..

Fix the typo in the comment. I missed it earlier in my haste to board a plane.

Wed, Sep 12, 11:21 PM
craig.topper added a comment to D51964: [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible..

This whole patch does fix the infinite loop from PR38915, but it requires the whole patch and not just the change in InstCombineSelect.cpp

Wed, Sep 12, 11:20 PM
craig.topper added a comment to D51964: [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible..

I don’t know if it fixes that PR. This transform causes an infinite loop on the tests where both max/min operands are invertible but not constants. Test50 and test51

Wed, Sep 12, 4:02 PM
craig.topper updated the diff for D51964: [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible..

More test cases and fix an infinite loop.

Wed, Sep 12, 9:32 AM

Tue, Sep 11

craig.topper created D51964: [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible..
Tue, Sep 11, 11:30 PM
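
The identity behind the D51964 title can be checked by brute force. The sketch below is a Python model with hypothetical helper names, not the InstCombine code: bitwise NOT reverses both signed and unsigned ordering, so inverting a max gives the min of the inverted operands and vice versa. It says nothing about when operands count as "freely invertible" or about the infinite-loop issue discussed in the review.

```python
def not_unsigned(value, bits):
    """Bitwise NOT of an unsigned value of the given width."""
    return value ^ ((1 << bits) - 1)


def signed_fold_holds(bits=8):
    """~smax(x, y) == smin(~x, ~y) and ~smin(x, y) == smax(~x, ~y) for all signed pairs."""
    values = range(-(1 << (bits - 1)), 1 << (bits - 1))
    # Python's ~v is -v - 1, which stays inside the signed range.
    return all(
        ~max(x, y) == min(~x, ~y) and ~min(x, y) == max(~x, ~y)
        for x in values
        for y in values
    )


def unsigned_fold_holds(bits=8):
    """The same identity for umax/umin on unsigned values."""
    values = range(1 << bits)
    return all(
        not_unsigned(max(x, y), bits) == min(not_unsigned(x, bits), not_unsigned(y, bits))
        and not_unsigned(min(x, y), bits) == max(not_unsigned(x, bits), not_unsigned(y, bits))
        for x in values
        for y in values
    )
```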
craig.topper updated the diff for D51893: [X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32.

Use 0x67 address size prefix for x32. Don't disable movsq with x32.

Tue, Sep 11, 4:55 PM
craig.topper added a comment to D51893: [X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32.

I believe we have the same bug for memset as well, but I'd prefer to tackle that as a follow up since I don't have a test case yet.

Tue, Sep 11, 4:55 PM
craig.topper added inline comments to D51893: [X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32.
Tue, Sep 11, 12:25 PM
craig.topper created D51940: [X86] Teach X86FastISel::X86SelectRet to use EAX for the sret pointer in GNUX32.
Tue, Sep 11, 10:33 AM
craig.topper created D51938: [InstCombine] Fix incorrect usage of getPrimitiveSizeInBits when we should be using the element size for vectors.
Tue, Sep 11, 10:08 AM
craig.topper accepted D51433: [InstCombine] enhance vector demanded elements to look at a vector select condition operand.

LGTM

Tue, Sep 11, 9:53 AM

Mon, Sep 10

craig.topper updated the diff for D51829: [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand..

Add the test case from the description.

Mon, Sep 10, 9:58 PM
craig.topper updated the summary of D51829: [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand..
Mon, Sep 10, 9:46 PM
craig.topper updated the diff for D49499: [X86] Prefer unpckhpd over movhlps in isel for fake unary cases.

Rebase

Mon, Sep 10, 8:10 PM
craig.topper abandoned D46711: [private] Add min_vector_width function attribute. Use it to annotate all of the x86 intrinsic header files. Emit a attribute in IR.
Mon, Sep 10, 7:35 PM
craig.topper created D51900: [InstCombine] Support (mul (sext x), cst) --> (sext (mul x, cst')) and (mul (zext x), cst) --> (zext (mul x, cst')) for vector constants..
Mon, Sep 10, 5:02 PM
craig.topper created D51893: [X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32.
Mon, Sep 10, 3:25 PM
craig.topper added a comment to D43515: More math intrinsics for conservative math handling.

Adding new constrained intrinsics and adding the pass should be separate patches, I think. Changing the syntax of frem should be another patch.

Mon, Sep 10, 10:31 AM
craig.topper added inline comments to D51433: [InstCombine] enhance vector demanded elements to look at a vector select condition operand.
Mon, Sep 10, 10:26 AM

Sun, Sep 9

craig.topper updated the diff for D41062: [X86] Legalize v2i32 via widening rather than promoting.

Rebase

Sun, Sep 9, 7:51 PM

Sat, Sep 8

craig.topper updated the diff for D51754: [X86] Remove isel patterns for ADCX instruction.

Rebase after the other ADC changes that have gone in recently. Add a non-ADX command line to the stack folding test.

Sat, Sep 8, 12:22 PM

Fri, Sep 7

craig.topper created D51829: [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand..
Fri, Sep 7, 5:47 PM
craig.topper added a comment to D51325: [X86] Type legalize v2i32 div/rem by scalarizing rather than promoting.

This is kind of similar, but here the issue isn't the number of scalar operations, it's the size of the scalar operations. This probably also applies to v2i8 and v2i16 and any other v2iX type where X is 32 or less.

Fri, Sep 7, 3:52 PM
craig.topper created D51818: [X86] Create paddus/psubus from narrower vectors with i8/i16 element types..
Fri, Sep 7, 2:40 PM
craig.topper added a comment to D51165: [CodeGen] emit inline asm clobber list warnings for reserved (cont).

For the failure I posted, it looks like the Live Variable Analysis pass is removing a register operand for a clobber without removing the preceding immediate operand that contains the inline assembly flags indicating it was a clobber. So later we see the clobber flag, but the register operand after it is missing.

Fri, Sep 7, 11:51 AM
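
The failure mode described above can be pictured with a toy operand list. This Python sketch is purely illustrative: `Flag`, `Reg`, and the helper functions are hypothetical names, and the encoding is far simpler than LLVM's real inline assembly operand layout. It only shows why removing a register operand without its preceding flag leaves a walker looking at the wrong kind of operand.

```python
from dataclasses import dataclass

CLOBBER = "clobber"  # toy stand-in for a flag kind, not LLVM's encoding


@dataclass
class Flag:
    """Toy model of the immediate operand that describes the operand after it."""
    kind: str


@dataclass
class Reg:
    """Toy model of a register operand."""
    name: str


def clobbered_registers(operands):
    """Walk the list; every clobber flag must be followed by a register."""
    names = []
    i = 0
    while i < len(operands):
        op = operands[i]
        if isinstance(op, Flag) and op.kind == CLOBBER:
            if i + 1 >= len(operands) or not isinstance(operands[i + 1], Reg):
                raise ValueError("clobber flag is not followed by a register operand")
            names.append(operands[i + 1].name)
            i += 2
        else:
            i += 1
    return names


def drop_register_only(operands, name):
    """The problematic removal: the register goes, its flag stays behind."""
    return [op for op in operands if not (isinstance(op, Reg) and op.name == name)]


def drop_flag_and_register(operands, name):
    """A consistent removal: the flag and its register are dropped together."""
    out = []
    i = 0
    while i < len(operands):
        nxt = operands[i + 1] if i + 1 < len(operands) else None
        if isinstance(operands[i], Flag) and isinstance(nxt, Reg) and nxt.name == name:
            i += 2
            continue
        out.append(operands[i])
        i += 1
    return out
```

With operands `[Flag, Reg("eax"), Flag, Reg("ebx")]`, dropping only `eax` leaves two flags in a row, so the walker finds a flag where it expects a register.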
craig.topper added a comment to D51165: [CodeGen] emit inline asm clobber list warnings for reserved (cont).

We're seeing a failure from this on this test case. Looks like somehow the operand after an InlineAsm::Kind_Clobber has become an immediate rather than a register. So the call to getReg fails.

Fri, Sep 7, 11:20 AM
craig.topper created D51805: [X86] Custom emit __builtin_rdtscp so we can emit an explicit store for the out parameter.
Fri, Sep 7, 11:02 AM
craig.topper created D51803: [X86] Modify the rdtscp intrinsic to return values instead of taking a pointer argument.
Fri, Sep 7, 11:01 AM

Thu, Sep 6

craig.topper created D51771: [X86] Modify addcarry/subborrow builtins to emit a 2 result intrinsic and a store instruction..
Thu, Sep 6, 11:24 PM
craig.topper created D51769: [X86] Change the addcarry and subborrow intrinsics to return 2 results and remove the pointer argument..
Thu, Sep 6, 9:33 PM
craig.topper created D51768: [X86] Teach X86DAGToDAGISel::foldLoadStoreIntoMemOperand to handle loads in operand 1 of commutable operations..
Thu, Sep 6, 8:13 PM
craig.topper created D51754: [X86] Remove isel patterns for ADCX instruction.
Thu, Sep 6, 2:32 PM
craig.topper added a comment to D51325: [X86] Type legalize v2i32 div/rem by scalarizing rather than promoting.

Ping

Thu, Sep 6, 1:11 PM
craig.topper updated the diff for D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

Pass a parameter to the lambda to indicate the LHS/RHS are swapped. Check this with an assert.

Thu, Sep 6, 10:14 AM
craig.topper added a comment to D51510: Move AESNI generation to Skylake and Goldmont.

Do you have commit access, or do you need someone to commit this for you?

Thu, Sep 6, 9:50 AM

Mon, Sep 3

craig.topper added a reviewer for D51599: test-suite: add avx512 tests with move-load-store intrinsics: RKSimon.
Mon, Sep 3, 7:28 PM
craig.topper added a comment to D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

So does this look ok?

Mon, Sep 3, 7:28 PM

Sat, Sep 1

craig.topper accepted D51565: Move FeatureAES from SLM, WSM and SNB to GLM and SKL.

LGTM

Sat, Sep 1, 4:15 AM

Thu, Aug 30

craig.topper accepted D51510: Move AESNI generation to Skylake and Goldmont.

LGTM. Can you update lib/Target/X86/X86.td in LLVM repo as well?

Thu, Aug 30, 3:15 PM
craig.topper added a reviewer for D51510: Move AESNI generation to Skylake and Goldmont: craig.topper.
Thu, Aug 30, 3:13 PM
craig.topper updated the diff for D51325: [X86] Type legalize v2i32 div/rem by scalarizing rather than promoting.

Disable the new custom scalarizing if we are legalizing v2i32 via widening instead of promoting. The generic type legalizer knows to scalarize v2i32 div/rem in that case since it can't widen a trapping operation.

Thu, Aug 30, 1:16 PM
craig.topper added a comment to D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

@davidxl, this transform doesn't really know if it's changing the compare predicate because the min/max matching considers (select (icmp slt X, ~C), ~X, C) to be equivalent to (select (icmp sgt ~X, C), ~X, C). And it only returns the select operands to the calling code. So this transform rewrites the icmp, but doesn't know which form the original compare took.

Thu, Aug 30, 12:14 PM
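
To make the equivalence in the comment above concrete, here is a brute-force Python check over 8-bit signed values. It is a sketch with hypothetical function names, not the min/max matcher itself: it only confirms that the two select forms always produce the same value, which is the signed max of ~X and C.

```python
def select_slt_form(x, c):
    """select (icmp slt X, ~C), ~X, C"""
    # Python's ~v is -v - 1, which stays inside the signed range.
    return ~x if x < ~c else c


def select_sgt_form(x, c):
    """select (icmp sgt ~X, C), ~X, C"""
    return ~x if ~x > c else c


def forms_agree(bits=8):
    """Compare both forms, and the signed max they encode, for every pair."""
    values = range(-(1 << (bits - 1)), 1 << (bits - 1))
    return all(
        select_slt_form(x, c) == select_sgt_form(x, c) == max(~x, c)
        for x in values
        for c in values
    )
```

Since X < ~C holds exactly when ~X > C, the compare can be written either way, and code that only sees the select operands cannot tell which form it started from.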
craig.topper updated the diff for D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

Now with lambda

Thu, Aug 30, 11:38 AM
craig.topper requested review of D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.
Thu, Aug 30, 11:38 AM
craig.topper added a comment to D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

LHS/RHS are set based on the select operands, not the compare operands, aren't they?

Thu, Aug 30, 11:05 AM
craig.topper updated the diff for D51401: [X86] Add support for turning vXi1 shuffles into KSHIFTL/KSHIFTR..

Address review comments. Rebase after pre-committing new tests.

Thu, Aug 30, 11:03 AM
craig.topper updated the diff for D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

Address review comments.

Thu, Aug 30, 12:21 AM

Wed, Aug 29

craig.topper updated the diff for D51401: [X86] Add support for turning vXi1 shuffles into KSHIFTL/KSHIFTR..

Add test cases for KNL and SKX. The KNL code is pretty terrible on anything but v16i1, but that's to be expected since we don't have the v32i1 or v64i1 types on KNL or an 8-bit kshift. The intrinsics that will be added for clang will only use 8, 32, and 64 on cpus with avx512dq/avx512bw so this isn't an issue that needs to be addressed immediately.

Wed, Aug 29, 2:49 PM
craig.topper updated the diff for D51231: [X86] Make Feature64Bit useful.

Forgot to drop an InstCombine patch from my tree before uploading that last patch.

Wed, Aug 29, 2:08 PM
craig.topper updated the diff for D51231: [X86] Make Feature64Bit useful.

Updated cpus.ll test to also check for this new error. Had to add a function body to get it to create the X86Subtarget correctly to make the check and had to fix some of the old RUN lines to not pass a 64-bit triple with a 32-bit cpu when they expect no error.

Wed, Aug 29, 1:52 PM
craig.topper added a comment to D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

I did briefly look at always hoisting it, but it potentially changes the critical path so I'm playing it safe.

Wed, Aug 29, 1:17 PM
craig.topper updated the diff for D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

Add comment. Use existing createMinMax helper to shorten some code.

Wed, Aug 29, 1:15 PM
craig.topper updated the diff for D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.
Wed, Aug 29, 12:40 PM
craig.topper added a comment to D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.

Looking into the more general fold. Trying to figure out the right magic to avoid an infinite loop.

Wed, Aug 29, 12:20 PM
craig.topper accepted D50070: [X86] Improved sched models for X86 CMPXCHG* instructions.

LGTM

Wed, Aug 29, 12:18 PM
craig.topper added a reviewer for D51231: [X86] Make Feature64Bit useful: efriedma.
Wed, Aug 29, 12:18 PM

Tue, Aug 28

craig.topper created D51401: [X86] Add support for turning vXi1 shuffles into KSHIFTL/KSHIFTR..
Tue, Aug 28, 11:53 PM
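
One way to picture the idea in the D51401 title: a shuffle of 1-bit lanes that slides every element by a fixed amount and fills the vacated lanes with zero is the same as shifting the mask as an integer. The Python below is a hedged model of that observation only. Which shuffle masks the patch actually matches is not stated here, and the names `shuffle_with_zero`, `kshiftl`, and `kshiftr` are hypothetical stand-ins, not the instruction definitions.

```python
def to_mask(lanes):
    """Pack a list of 0/1 lanes into an integer; lane 0 becomes bit 0."""
    return sum(bit << i for i, bit in enumerate(lanes))


def to_lanes(mask, width):
    """Unpack an integer mask into a list of 0/1 lanes."""
    return [(mask >> i) & 1 for i in range(width)]


def shuffle_with_zero(lanes, indices):
    """Each index picks a source lane; None selects a zero lane."""
    return [lanes[i] if i is not None else 0 for i in indices]


def kshiftl(mask, amount, width):
    """Model of a left shift on a mask register of the given width."""
    return (mask << amount) & ((1 << width) - 1)


def kshiftr(mask, amount, width):
    """Model of a logical right shift on a mask register."""
    return (mask & ((1 << width) - 1)) >> amount


def shuffles_match_shifts(width=8):
    """Check every mask value and shift amount for both directions."""
    for mask in range(1 << width):
        lanes = to_lanes(mask, width)
        for k in range(width + 1):
            left = [None] * k + list(range(width - k))    # lane i reads lane i - k
            right = list(range(k, width)) + [None] * k    # lane i reads lane i + k
            if to_mask(shuffle_with_zero(lanes, left)) != kshiftl(mask, k, width):
                return False
            if to_mask(shuffle_with_zero(lanes, right)) != kshiftr(mask, k, width):
                return False
    return True
```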
craig.topper created D51398: [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible.
Tue, Aug 28, 6:21 PM
craig.topper updated the diff for D51337: [X86] Support v2i32 gather/scatter indices with -x86-experimental-vector-widening-legalization.

I think I accidentally merged my commits in my tree and picked up some extra files.

Tue, Aug 28, 5:05 PM
craig.topper updated the diff for D51337: [X86] Support v2i32 gather/scatter indices with -x86-experimental-vector-widening-legalization.

Take Eli's suggestion and only widen the index. This fixes the scatter/gather index only widening cases to use the better instruction. The default 'promote' case is still promoting to a v2i64 index type since that's what type legalization will naturally do.

Tue, Aug 28, 5:02 PM
craig.topper accepted D50636: [DAGCombiner] Add X / X -> 1 & X % X -> 0 folds..

LGTM

Tue, Aug 28, 3:44 PM
craig.topper created D51370: [X86] Add intrinsics for KADD instructions.
Tue, Aug 28, 11:00 AM
craig.topper updated the diff for D51325: [X86] Type legalize v2i32 div/rem by scalarizing rather than promoting.

Rebase after fixing tests

Tue, Aug 28, 8:27 AM

Mon, Aug 27

craig.topper created D51337: [X86] Support v2i32 gather/scatter indices with -x86-experimental-vector-widening-legalization.
Mon, Aug 27, 9:49 PM
craig.topper added a comment to D50636: [DAGCombiner] Add X / X -> 1 & X % X -> 0 folds..

@jonpa, it looks like changing the urem to this will trigger the original bug and avoid the optimization Simon is adding here.

Mon, Aug 27, 8:02 PM
craig.topper added inline comments to D50070: [X86] Improved sched models for X86 CMPXCHG* instructions.
Mon, Aug 27, 7:14 PM
craig.topper updated the diff for D51236: [InstCombine] Extend (add (sext x), cst) --> (sext (add x, cst')) and (add (zext x), cst) --> (zext (add x, cst')) to work for vectors.

Switch to m_Constant

Mon, Aug 27, 4:15 PM
craig.topper updated the diff for D51325: [X86] Type legalize v2i32 div/rem by scalarizing rather than promoting.

Fix 32-bit and 64-bit check line swap

Mon, Aug 27, 2:55 PM
craig.topper added a comment to D51325: [X86] Type legalize v2i32 div/rem by scalarizing rather than promoting.

Doh! I got my check-prefixes reversed. Let me fix that.

Mon, Aug 27, 2:35 PM
craig.topper created D51325: [X86] Type legalize v2i32 div/rem by scalarizing rather than promoting.
Mon, Aug 27, 2:31 PM

Sun, Aug 26

craig.topper updated the diff for D51284: [X86] When lowering v32i8 MULHS/MULHU, shuffle after the PACKUS rather than before..

Missed a test case

Sun, Aug 26, 11:55 PM
craig.topper created D51284: [X86] When lowering v32i8 MULHS/MULHU, shuffle after the PACKUS rather than before..
Sun, Aug 26, 11:49 PM

Sat, Aug 25

craig.topper updated the diff for D41062: [X86] Legalize v2i32 via widening rather than promoting.

Address a review comment.

Sat, Aug 25, 11:07 PM
craig.topper updated the diff for D41062: [X86] Legalize v2i32 via widening rather than promoting.

Rebase

Sat, Aug 25, 11:06 PM
craig.topper created D51267: [X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F..
Sat, Aug 25, 10:00 PM
craig.topper updated the diff for D50491: [DAGCombiner][AMDGPU][Mips] Fold bitcast with volatile loads if the resulting load is legal for the target..

Add comment

Sat, Aug 25, 8:03 PM
craig.topper added reviewers for D50491: [DAGCombiner][AMDGPU][Mips] Fold bitcast with volatile loads if the resulting load is legal for the target.: atanasyan, arsenm.
Sat, Aug 25, 8:03 PM
craig.topper created D51264: [X86] Add FeatureCMOV to athlon and athlon-tbird cpus..
Sat, Aug 25, 7:02 PM
craig.topper updated the diff for D50952: [X86] Add support for matching paddus patterns where one of the vectors is a constant..

Rebase after committing test cases

Sat, Aug 25, 5:41 PM
craig.topper added inline comments to D51254: [X86] Replace support for vXi32 SMUL_LOHI/UMUL_LOHI with MULHS/MULHU support instead..
Sat, Aug 25, 10:57 AM