craig.topper (Craig Topper)
User

User Details

User Since
Jul 30 2013, 7:58 PM (224 w, 5 d)

Recent Activity

Yesterday

craig.topper added inline comments to D40223: [X86] Control-Flow Enforcement Technology - Shadow Stack support (LLVM side).
Sun, Nov 19, 3:07 PM
craig.topper updated the summary of D40230: Add -mprefer-vector-width driver option and attribute during CodeGen..
Sun, Nov 19, 2:54 PM
craig.topper created D40230: Add -mprefer-vector-width driver option and attribute during CodeGen..
Sun, Nov 19, 2:53 PM
craig.topper created D40228: [Target] Keep the TargetOptions feature list sorted instead of sorting during CodeGen.
Sun, Nov 19, 1:05 PM
craig.topper created D40226: [CodeGen] Move Reciprocals option from TargetOptions to CodeGenOptions.
Sun, Nov 19, 11:28 AM
craig.topper added a comment to D40222: [x86][icelake]BITALG.

If it supports masking we can't use the intrinsic in the tablegen as it would go against our normal lowering of intrinsics.

Sun, Nov 19, 10:47 AM

Sat, Nov 18

craig.topper abandoned D38824: [X86] Synchronize the existing CPU predefined macros with the cases that gcc defines them.

All skylake-avx512 and cannonlake now set corei7 as of r318616. Abandoning this.

Sat, Nov 18, 6:59 PM
craig.topper added a comment to D40213: [x86][icelake]BITALG.

Can you add command lines to vector-tzcnt-128/256/512.ll? We should be able to use popcnt for tzcnt when avx512cd's lzcnt is not available.

Sat, Nov 18, 4:16 PM
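
The tzcnt-from-popcnt suggestion in the D40213 comment above rests on a standard bit identity, sketched here on scalars. This is an illustration only, not LLVM's lowering code; the helper names `popcnt` and `tzcnt_via_popcnt` and the `width` parameter are assumptions made for the example.

```python
def popcnt(x: int) -> int:
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")


def tzcnt_via_popcnt(x: int, width: int = 32) -> int:
    """Trailing-zero count of a `width`-bit value using only popcount.

    ~x & (x - 1) keeps exactly the bits below the lowest set bit of x,
    so counting them gives the number of trailing zeros. For x == 0 the
    mask is all ones, so the result is `width`.
    """
    mask = (1 << width) - 1
    x &= mask
    return popcnt(~x & (x - 1) & mask)
```

The same expression applies per element in a vector, which is why a vector popcount alone is enough to build a vector tzcnt.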
craig.topper accepted D40208: [x86][icelake]VNNI.

LGTM

Sat, Nov 18, 10:36 AM

Fri, Nov 17

craig.topper added inline comments to D40206: [x86][icelake]vbmi2.
Fri, Nov 17, 10:45 PM
craig.topper added inline comments to D40206: [x86][icelake]vbmi2.
Fri, Nov 17, 10:42 PM
craig.topper added inline comments to D39688: [Nios2] final infrastructure addition to provide compilation of simple return from a function..
Fri, Nov 17, 11:44 AM

Thu, Nov 16

craig.topper added inline comments to D38313: [InstCombine] Introducing Aggressive Instruction Combine pass.
Thu, Nov 16, 10:50 PM
craig.topper added a comment to D40055: [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes.

Yes, I'm going to fix the QQ issue as well.

Thu, Nov 16, 10:11 PM
craig.topper added a comment to D39851: [X86] Add separate intrinsics for scalar FMA4 instructions..

Ping

Thu, Nov 16, 11:45 AM

Wed, Nov 15

craig.topper added inline comments to D40078: [x86][icelake]VAES introduction.
Wed, Nov 15, 5:14 PM
craig.topper requested changes to D40101: [x86][icelake]vpclmulqdq introduction.

Sorry, I think there's still a bug.

Wed, Nov 15, 4:05 PM
craig.topper accepted D40101: [x86][icelake]vpclmulqdq introduction.

LGTM

Wed, Nov 15, 4:01 PM
craig.topper added inline comments to D40101: [x86][icelake]vpclmulqdq introduction.
Wed, Nov 15, 2:43 PM
craig.topper accepted D40093: Split x86 "Processor" info into its own def file. [NFC].

LGTM

Wed, Nov 15, 2:23 PM
craig.topper added inline comments to D40078: [x86][icelake]VAES introduction.
Wed, Nov 15, 12:51 PM
craig.topper added a comment to D38566: [SimplifyCFG] don't sink common insts too soon (PR34603).

One issue with the struct approach: suppose I opt-bisect to a failure caused by one of these simplifyCFG passes. How do I invoke that specific version of simplifycfg from opt to debug it or even produce a reduced test case?

Wed, Nov 15, 11:42 AM
craig.topper added inline comments to D40078: [x86][icelake]VAES introduction.
Wed, Nov 15, 10:02 AM
craig.topper added a reviewer for D40078: [x86][icelake]VAES introduction: craig.topper.
Wed, Nov 15, 9:46 AM

Tue, Nov 14

craig.topper added a comment to D39952: [X86]: Adding full coverage of MC encoding for all X86 ISA Sets.NFC.

I don't recognize some of these

Tue, Nov 14, 11:45 PM
craig.topper accepted D40054: Simplify CpuIs code to use include from LLVM.

LGTM

Tue, Nov 14, 4:01 PM
craig.topper created D40055: [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes.
Tue, Nov 14, 3:43 PM
craig.topper updated the diff for D40035: [LoopRotate] processLoop should return true even if it just simplified the loop latch without making any other changes.

Now with test case.

Tue, Nov 14, 11:37 AM
craig.topper created D40035: [LoopRotate] processLoop should return true even if it just simplified the loop latch without making any other changes.
Tue, Nov 14, 9:46 AM
craig.topper added inline comments to D38313: [InstCombine] Introducing Aggressive Instruction Combine pass.
Tue, Nov 14, 8:59 AM
craig.topper created D40007: [NewPassManager] Pass the -fdebug-pass-manager flag setting into the Analysis managers to match what we do in opt.
Tue, Nov 14, 12:04 AM

Mon, Nov 13

craig.topper created D39999: [InstCombine] Simplify binops that are only used by a select and are fed by a select with the same condition..
Mon, Nov 13, 10:50 PM
craig.topper added inline comments to D39840: [MC][X86] Code padding for performance stability - Branch instructions and targets alignment.
Mon, Nov 13, 7:03 PM
craig.topper planned changes to D37885: [x86] Bring back the MOVZX64rr* pseudo instructions so that they can be coalesced using X86InstrInfo::isCoalescableExtInstr.

I noticed this is missing some load folding of stack reloads in the multiply tests. I'm also not seeing much benefit and a couple small regressions in our benchmark runs. So I'm going to hold off on this.

Mon, Nov 13, 4:48 PM
craig.topper added inline comments to D39720: [X86][AVX512] lowering kunpack intrinsic - llvm part.
Mon, Nov 13, 10:49 AM
craig.topper added inline comments to D39840: [MC][X86] Code padding for performance stability - Branch instructions and targets alignment.
Mon, Nov 13, 12:02 AM

Sun, Nov 12

craig.topper updated the diff for D39927: [X86] Allow X86ISD::Wrapper to be folded into the base of gather/scatter address.

Add a large code model test. Merge 'else' and 'if (matchVectorAddress'. Add a comment.

Sun, Nov 12, 11:16 PM
craig.topper added a comment to D39833: [X86][SKX] Adding scheduling info of non-intrinsic + commutable SKX opcodes..

Sorry, I didn't necessarily mean to add MIN/MAX to this patch. I assume they are missing from all scheduling models.

Sun, Nov 12, 10:57 PM
craig.topper added a comment to D39833: [X86][SKX] Adding scheduling info of non-intrinsic + commutable SKX opcodes..

I noticed this while looking at the patch: can you also add the (V)MAXCPD/MAXCPS/MAXCSD/MAXCSS/MINCPD/MINCPS/MINCSD/MINCSS instructions? They're equivalent to their counterparts without the 'C'. They are part of a hack to make floating point min/max commutable under fast math.

Sun, Nov 12, 1:17 AM
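
For context on why floating point min/max is not commutable without such a hack: the SSE MIN/MAX instructions return the second operand whenever the comparison is false, which includes a NaN input or two zeros of opposite sign. The scalar model below is a rough sketch of that selection rule only (no exception flags, no SNaN handling); the names `x86_min` and `x86_max` are hypothetical.

```python
import math


def x86_min(a: float, b: float) -> float:
    """Model of MINSS-style selection: a if a < b, otherwise b."""
    return a if a < b else b


def x86_max(a: float, b: float) -> float:
    """Model of MAXSS-style selection: a if a > b, otherwise b."""
    return a if a > b else b


# Swapping the operands changes the result when a NaN is involved...
nan = float("nan")
print(x86_min(nan, 1.0), x86_min(1.0, nan))

# ...and changes the sign of the result for zeros of opposite sign.
print(x86_min(0.0, -0.0), x86_min(-0.0, 0.0))
```

Under fast math, NaNs and the sign of zero are assumed not to matter, so the operands can be swapped freely.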
craig.topper accepted D39833: [X86][SKX] Adding scheduling info of non-intrinsic + commutable SKX opcodes..

LGTM

Sun, Nov 12, 1:09 AM

Sat, Nov 11

craig.topper added inline comments to D39927: [X86] Allow X86ISD::Wrapper to be folded into the base of gather/scatter address.
Sat, Nov 11, 1:43 PM

Fri, Nov 10

craig.topper accepted D38737: [X86] test/testn intrinsics lowering to IR. clang side.

LGTM

Fri, Nov 10, 10:19 PM
craig.topper updated the diff for D39766: [InstCombine] Teach visitICmpInst to not break integer absolute value idioms.

Use matchSelectPattern instead. Use user_back instead of user_begin since it does the same thing without the explicit dereference. Also updated the equivalent place in visitFCmp.

Fri, Nov 10, 4:30 PM
craig.topper abandoned D39527: [X86] Add a .def file to contain information about mapping vendor/type/subtype to -march strings.
Fri, Nov 10, 4:15 PM
craig.topper added a comment to D39847: [X86] Avoid unecessary opsize byte in segment move to memory.

gas seems to error on movl %fs,(%rsi) and movl (%rsi), %fs

Fri, Nov 10, 4:14 PM
craig.topper created D39927: [X86] Allow X86ISD::Wrapper to be folded into the base of gather/scatter address.
Fri, Nov 10, 3:26 PM
craig.topper updated the diff for D39911: [SelectionDAG] Teach SelectionDAGBuilder's getUniformBase for gather/scatter handling to accept GEPs with more than 2 operands if the middle operands are all 0s.

Remove the hasIndices check

Fri, Nov 10, 1:41 PM
craig.topper created D39911: [SelectionDAG] Teach SelectionDAGBuilder's getUniformBase for gather/scatter handling to accept GEPs with more than 2 operands if the middle operands are all 0s.
Fri, Nov 10, 11:00 AM
craig.topper added a comment to D39830: [DAGCombine] Transform (A + -2.0*B*C) -> (A - (B+B)*C).

This solution doesn't seem very general, it won't catch.

Fri, Nov 10, 9:29 AM
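
The rewrite named in the D39830 title can be checked numerically. The sketch below assumes IEEE-754 doubles with round-to-nearest and no overflow, where doubling and negation are exact, so both forms should agree bit for bit; the function names `original` and `rewritten` are hypothetical.

```python
import random


def original(a: float, b: float, c: float) -> float:
    # A + -2.0*B*C, evaluated as A + ((-2.0 * B) * C)
    return a + -2.0 * b * c


def rewritten(a: float, b: float, c: float) -> float:
    # A - (B+B)*C: the multiply by -2.0 becomes an add, and the outer
    # add becomes a subtract
    return a - (b + b) * c


def check(samples: int = 10_000, seed: int = 0) -> bool:
    """Compare both forms on random inputs in a range that cannot overflow."""
    rng = random.Random(seed)
    for _ in range(samples):
        a, b, c = (rng.uniform(-1e3, 1e3) for _ in range(3))
        if original(a, b, c) != rewritten(a, b, c):
            return False
    return True
```

This covers only the one pattern in the title; it says nothing about the other shapes the comment above suggests the patch would miss.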
craig.topper accepted D39596: Allow separation of declarations and definitions in <Target>ISelDAGToDAG.inc.

LGTM

Fri, Nov 10, 9:23 AM

Thu, Nov 9

craig.topper updated the diff for D39851: [X86] Add separate intrinsics for scalar FMA4 instructions..

Remove accidental update

Thu, Nov 9, 11:26 AM
craig.topper updated the diff for D39851: [X86] Add separate intrinsics for scalar FMA4 instructions..
  • Gather optimization
Thu, Nov 9, 11:25 AM
craig.topper updated the summary of D39851: [X86] Add separate intrinsics for scalar FMA4 instructions..
Thu, Nov 9, 10:35 AM
craig.topper created D39851: [X86] Add separate intrinsics for scalar FMA4 instructions..
Thu, Nov 9, 10:34 AM
craig.topper added a comment to D39833: [X86][SKX] Adding scheduling info of non-intrinsic + commutable SKX opcodes..

We should have scheduling information for both forms. They are both used by isel and both make it all the way to code emission.

Thu, Nov 9, 8:13 AM

Tue, Nov 7

craig.topper added reviewers for D39782: [X86] Add a def file to CPU vendor, type, and subtype encodings used by Host.cpp: erichkeane, echristo, asbirlea.
Tue, Nov 7, 11:35 PM
craig.topper updated the diff for D39782: [X86] Add a def file to CPU vendor, type, and subtype encodings used by Host.cpp.

Fix some typos

Tue, Nov 7, 11:34 PM
craig.topper created D39782: [X86] Add a def file to CPU vendor, type, and subtype encodings used by Host.cpp.
Tue, Nov 7, 11:31 PM
craig.topper added reviewers for D39704: [X86] [CodeGen] Compiler not using SHLD/SHRD instructions when doing double shift pattern combine for 16bit or 8bit arguments (PR35155): spatel, RKSimon.
Tue, Nov 7, 9:14 PM
craig.topper added a comment to D39720: [X86][AVX512] lowering kunpack intrinsic - llvm part.

I don't see any tests that produce kunpckbw after this change.

Tue, Nov 7, 9:12 PM
craig.topper added inline comments to D39719: [X86][AVX512] lowering kunpack intrinsic - clang part.
Tue, Nov 7, 9:07 PM
craig.topper updated the diff for D39766: [InstCombine] Teach visitICmpInst to not break integer absolute value idioms.

Add full context

Tue, Nov 7, 4:27 PM
craig.topper created D39766: [InstCombine] Teach visitICmpInst to not break integer absolute value idioms.
Tue, Nov 7, 3:29 PM
craig.topper created D39760: [SimplifyCFG] Teach merge conditional stores to handle cases where the PostBB has more than 2 predecessors by inserting a new block for the store..
Tue, Nov 7, 1:52 PM

Mon, Nov 6

craig.topper added inline comments to D38737: [X86] test/testn intrinsics lowering to IR. clang side.
Mon, Nov 6, 10:59 AM
craig.topper updated the diff for D39222: [InstCombine] Pull shifts through a select plus binop with constant.

Address Sanjay's comments.

Mon, Nov 6, 10:57 AM
craig.topper added inline comments to D38737: [X86] test/testn intrinsics lowering to IR. clang side.
Mon, Nov 6, 10:52 AM
craig.topper accepted D38736: [X86] test/testn intrinsics lowering to IR. llvm part..

LGTM

Mon, Nov 6, 10:45 AM
craig.topper accepted D39631: [X86] Fix the spelling of 3dnow and 3dnowa in isValidFeatureName.

LGTM

Mon, Nov 6, 10:38 AM

Sat, Nov 4

craig.topper added a comment to D39631: [X86] Fix the spelling of 3dnow and 3dnowa in isValidFeatureName.

Ok then we can keep the new test.

Sat, Nov 4, 5:28 PM
craig.topper requested changes to D39631: [X86] Fix the spelling of 3dnow and 3dnowa in isValidFeatureName.

Can we just add -Werror to test/CodeGen/3dnow-builtins.c to test this? I believe it should be throwing a warning currently.

Sat, Nov 4, 3:05 PM
craig.topper accepted D39631: [X86] Fix the spelling of 3dnow and 3dnowa in isValidFeatureName.

LGTM

Sat, Nov 4, 2:58 PM
craig.topper added a comment to D38671: [X86][AVX512] lowering shuffle i/f intrinsic - llvm part.

I believe the X86ISelLowering.cpp changes are no longer necessary after r317403 and r317410. I've taught lowering to prefer the EVEX encoded SHUF instructions by default and then EVEX->VEX will turn them back into VPERM2F128 if they don't end up being masked, using broadcast loads, or using XMM16-31.

Sat, Nov 4, 11:16 AM

Fri, Nov 3

craig.topper added a comment to D39570: [SimplifyCFG] When merging conditional stores, don't count the store we're merging against the PHINodeFoldingThreshold.

This had no effect on the binary output for any of the benchmarks we ran on trunk.

Fri, Nov 3, 11:45 AM
craig.topper added inline comments to D39575: [X86] Add subtarget features prefer-avx256 and prefer-avx128 and use them to limit vector width presented by TTI.
Fri, Nov 3, 10:30 AM
craig.topper added a comment to D39575: [X86] Add subtarget features prefer-avx256 and prefer-avx128 and use them to limit vector width presented by TTI.

In the longer term we may end up making -mprefer-avx256 be on by default for skylake-avx512, so I think a vectorizer-specific command line option would prevent us from having that control.

Fri, Nov 3, 10:22 AM

Thu, Nov 2

craig.topper accepted D39585: PR35131 Fix a misprint in CTLZ recognition.

LGTM

Thu, Nov 2, 10:01 PM
craig.topper created D39583: [X86] Don't use RCP14 and RSQRT14 for reciprocal estimations or for legacy SSE rcp/rsqrt intrinsics when AVX512 features are enabled..
Thu, Nov 2, 5:32 PM
craig.topper created D39576: [X86] Add -mprefer-avx256 and -mprefer-avx128 to the clang driver.
Thu, Nov 2, 3:42 PM
craig.topper created D39575: [X86] Add subtarget features prefer-avx256 and prefer-avx128 and use them to limit vector width presented by TTI.
Thu, Nov 2, 3:20 PM
craig.topper created D39570: [SimplifyCFG] When merging conditional stores, don't count the store we're merging against the PHINodeFoldingThreshold.
Thu, Nov 2, 2:38 PM
craig.topper accepted D39546: Fix for Bug 34475 - LOCK/REP/REPNE prefixes emitted as instruction on their own.

LGTM

Thu, Nov 2, 10:12 AM
craig.topper added a comment to D39481: [CodeGen] fix const-ness of builtin equivalents of <math.h> and <complex.h> functions that might set errno.

There's an oddity with fma. The version without __builtin has 'e' already

Thu, Nov 2, 10:10 AM

Wed, Nov 1

craig.topper added a comment to D39527: [X86] Add a .def file to contain information about mapping vendor/type/subtype to -march strings.

I added a CPUISVALID that you should be able to use to know if it's valid for builtin_cpu_is. The name for builtin_cpu_is is the first string; it is the empty string if it's not valid. The corresponding march names are the second string. Host.cpp needed to translate to -march names.

Wed, Nov 1, 10:57 PM
craig.topper added a reviewer for D39527: [X86] Add a .def file to contain information about mapping vendor/type/subtype to -march strings: erichkeane.
Wed, Nov 1, 10:53 PM
craig.topper created D39527: [X86] Add a .def file to contain information about mapping vendor/type/subtype to -march strings.
Wed, Nov 1, 10:53 PM
craig.topper added a reviewer for D39521: [x86 TargetInfo] Pull CPU handling for the x86 TargetInfo into a .def file.: RKSimon.
Wed, Nov 1, 4:44 PM

Tue, Oct 31

craig.topper abandoned D35044: [IR] Remove the opcode argument from CmpInst::Create.
Tue, Oct 31, 8:36 PM
craig.topper updated the diff for D39450: [X86] Add AVX512 support to X86FastISel::X86SelectSIToFP.

Remove code that wasn't supposed to be in this review.

Tue, Oct 31, 5:49 PM
craig.topper retitled D39450: [X86] Add AVX512 support to X86FastISel::X86SelectSIToFP from [X86] Add AVX512 support to X86FastISel::fastMaterializeFloatZero. to [X86] Add AVX512 support to X86FastISel::X86SelectSIToFP.
Tue, Oct 31, 5:26 PM
craig.topper added reviewers for D39450: [X86] Add AVX512 support to X86FastISel::X86SelectSIToFP: RKSimon, zvi.
Tue, Oct 31, 12:07 AM
craig.topper created D39450: [X86] Add AVX512 support to X86FastISel::X86SelectSIToFP.
Tue, Oct 31, 12:07 AM

Mon, Oct 30

craig.topper updated the diff for D39402: [X86] Prevent fast isel from folding loads into the instructions listed in hasPartialRegUpdate..

Replicate check instead of moving it. Add optsize fast-isel tests.

Mon, Oct 30, 8:38 PM
craig.topper planned changes to D38824: [X86] Synchronize the existing CPU predefined macros with the cases that gcc defines them.
Mon, Oct 30, 8:19 PM
craig.topper created D39440: [SimplifyCFG] Use a more generic name for the selects created by SpeculativelyExecuteBB to prevent long names from being created.
Mon, Oct 30, 6:16 PM
craig.topper added inline comments to D38832: [X86][SelectionDAG] Add support for simplifying demanded bits of target nodes. Use it to simplify demanded bits of CMOV.
Mon, Oct 30, 11:09 AM
craig.topper added a comment to D37885: [x86] Bring back the MOVZX64rr* pseudo instructions so that they can be coalesced using X86InstrInfo::isCoalescableExtInstr.

Ping

Mon, Oct 30, 11:05 AM
craig.topper added a comment to D39222: [InstCombine] Pull shifts through a select plus binop with constant.

Ping

Mon, Oct 30, 11:05 AM
craig.topper added a comment to D39402: [X86] Prevent fast isel from folding loads into the instructions listed in hasPartialRegUpdate..

I don't think this makes fast isel slower; rather, it makes sure the fast isel entry point does this check (at least the one used by X86FastISel::tryToFoldLoadIntoMI). So it actually may slow down peephole folding and spill folding. Maybe I should just leave the other checks in place and add a new one?

Mon, Oct 30, 11:04 AM
craig.topper accepted D39383: [InstCombine] Simplify selects that test cmpxchg instructions.

LGTM

Mon, Oct 30, 10:59 AM