
Yesterday

spatel accepted D67363: [BreakFalseDeps] ignore function with minsize attribute.

Marking patch as accepted based on "looks good" and "looks fine". I'll hold off on commit for a ~day in case there are any more comments.

Sun, Sep 22, 8:43 AM · Restricted Project
spatel committed rGeb8d39e11315: [InstCombine] allow icmp+binop folds before min/max bailout (PR43310) (authored by spatel).
[InstCombine] allow icmp+binop folds before min/max bailout (PR43310)
Sun, Sep 22, 7:34 AM
spatel committed rL372510: [InstCombine] allow icmp+binop folds before min/max bailout (PR43310).
[InstCombine] allow icmp+binop folds before min/max bailout (PR43310)
Sun, Sep 22, 7:30 AM
spatel committed rGd2a524288d11: [InstCombine] add tests for icmp fold hindered by min/max; NFC (authored by spatel).
[InstCombine] add tests for icmp fold hindered by min/max; NFC
Sun, Sep 22, 7:25 AM
spatel committed rL372509: [InstCombine] add tests for icmp fold hindered by min/max; NFC.
[InstCombine] add tests for icmp fold hindered by min/max; NFC
Sun, Sep 22, 7:21 AM

Sat, Sep 21

spatel added a comment to D67800: [InstCombine] Fold a shifty implementation of clamp positive to allOnesValue..

clamp255: # @clamp255

cmpl    $256, %edi              # imm = 0x100
movl    $255, %eax
cmovll  %edi, %eax
movzbl  %al, %eax
retq
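
For readers who don't want to decode the asm, here is a line-by-line C++ reading of what that sequence computes. The function name is made up for illustration, and this is not necessarily the source that produced the listing; it only mirrors the instructions shown (signed compare against 256, select, then zero-extend the low byte).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name; mirrors the asm above rather than any original source.
uint32_t clamp255_as_compiled(int32_t x) {
  // cmpl $256 / movl $255 / cmovll: keep x if x < 256 (signed), else 255.
  int32_t selected = (x < 256) ? x : 255;
  // movzbl %al, %eax: keep only the low 8 bits, zero-extended.
  return static_cast<uint32_t>(selected) & 0xFFu;
}
```

Note that negative inputs are not clamped by the compare; they are only truncated by the final byte extension.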
Sat, Sep 21, 11:24 AM · Restricted Project
spatel added a comment to D67799: [InstCombine] Fold a shifty implementation of clamp negative to zero..

Please change clamp0 everywhere to "clamp negative to zero"; it wasn't obvious what clamp0 means until reading the whole patch.
This looks ok otherwise. Please wait for @spatel to comment.

For the X86, AArch64 and ARM targets, the backend produces better asm with this transformation. Please refer to the examples below:

I'd agree. @spatel ?

Sat, Sep 21, 11:06 AM · Restricted Project

Fri, Sep 20

spatel updated the diff for D67363: [BreakFalseDeps] ignore function with minsize attribute.

Patch updated:
Added a TODO comment about using a splat load on the ARM tests.

Fri, Sep 20, 12:35 PM · Restricted Project
spatel added a comment to D51701: ValueTracking: Report fast math flags for fcmp/select.

Since select should have FMF now anyway, I don't think this is needed anymore

Fri, Sep 20, 12:29 PM
spatel accepted D67677: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. a/b with mask (PR42563).

LGTM

Fri, Sep 20, 10:42 AM · Restricted Project
spatel added a comment to D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine.

Is this similar to D42981?

Fri, Sep 20, 10:14 AM · Restricted Project
spatel added a comment to D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine.

We decided that load combining was unsuitable for IR because it could obscure other optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.

For reference, can you link those patches/discussions?
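
As a rough illustration of the kind of source pattern that load combining targets, consider assembling a 32-bit value from four adjacent byte loads. This is only a sketch; the function name is hypothetical and the exact patterns D67841 matches are not shown in this feed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: four adjacent byte loads, shifted and OR'd together
// to form one little-endian 32-bit value. A backend that load-combines may
// turn this into a single 32-bit load on a little-endian target.
uint32_t load_le32(const uint8_t *p) {
  return static_cast<uint32_t>(p[0]) |
         (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) |
         (static_cast<uint32_t>(p[3]) << 24);
}
```

The OR chain here looks like a reduction over four loaded values, which is presumably why a reduction transform and load combining can compete for the same code.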

Fri, Sep 20, 10:06 AM · Restricted Project
spatel created D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine.
Fri, Sep 20, 9:46 AM · Restricted Project
spatel added a comment to D67799: [InstCombine] Fold a shifty implementation of clamp negative to zero..

We need to confirm that the backend produces better asm for at least a few in-tree targets before/after this transform. Please attach output for x86 and AArch64. We'll want to have examples for scalar and vector code, so you probably need to suppress the vectorizers.
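
For context, the two forms being compared can be sketched in C++ as below. This assumes the "shifty" idiom is the usual shift-and-mask pattern; the patch's exact matcher is not quoted in this feed, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed shift-and-mask form: x >> 31 is all-ones for negative x and zero
// otherwise (arithmetic right shift, guaranteed since C++20 and the norm on
// mainstream compilers before that), so masking with its complement zeroes
// out negative inputs.
int32_t clamp_negative_to_zero_shifty(int32_t x) {
  return x & ~(x >> 31);
}

// Compare-and-select form of the same operation.
int32_t clamp_negative_to_zero_select(int32_t x) {
  return x < 0 ? 0 : x;
}
```

Compiling both functions for each target (with the vectorizers suppressed for the scalar case) is one way to produce the before/after asm requested above.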

Fri, Sep 20, 7:35 AM · Restricted Project
spatel committed rG4896f7243d62: [SLPVectorizer] add tests for bogus reductions; NFC (authored by spatel).
[SLPVectorizer] add tests for bogus reductions; NFC
Fri, Sep 20, 7:21 AM
spatel committed rL372393: [SLPVectorizer] add tests for bogus reductions; NFC.
[SLPVectorizer] add tests for bogus reductions; NFC
Fri, Sep 20, 7:20 AM

Thu, Sep 19

spatel committed rG13e71ce69319: [Float2Int] avoid crashing on unreachable code (PR38502) (authored by spatel).
[Float2Int] avoid crashing on unreachable code (PR38502)
Thu, Sep 19, 9:32 AM
spatel committed rL372339: [Float2Int] avoid crashing on unreachable code (PR38502).
[Float2Int] avoid crashing on unreachable code (PR38502)
Thu, Sep 19, 9:30 AM
spatel closed D67766: [Float2Int] avoid crashing on unreachable code (PR38502).
Thu, Sep 19, 9:30 AM · Restricted Project
spatel added inline comments to D67677: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. a/b with mask (PR42563).
Thu, Sep 19, 8:42 AM · Restricted Project
spatel created D67766: [Float2Int] avoid crashing on unreachable code (PR38502).
Thu, Sep 19, 7:59 AM · Restricted Project
spatel accepted D67557: [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863).

LGTM

Thu, Sep 19, 7:15 AM · Restricted Project
spatel committed rG7592e3a81fc0: [Float2Int] auto-generate complete test checks; NFC (authored by spatel).
[Float2Int] auto-generate complete test checks; NFC
Thu, Sep 19, 6:59 AM
spatel committed rL372324: [Float2Int] auto-generate complete test checks; NFC.
[Float2Int] auto-generate complete test checks; NFC
Thu, Sep 19, 6:56 AM
spatel accepted D67711: [DAG] Add SelectionDAG::MaxRecursionDepth constant.

LGTM

Thu, Sep 19, 5:00 AM · Restricted Project
spatel added a comment to D67721: [InstSimplify] fold fma/fmuladd with a NaN operand.
D67553 adds `SimplifyFMAMul`. We should probably also have them there. Also, could we make use of multiply by 0.0 here, with the required fast-math flags?
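
A quick way to see the behavior the fold relies on, using the C++ standard library rather than LLVM itself: an fma with a quiet NaN in any operand position yields NaN. The helper name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: reports whether fma(a, b, c) evaluates to NaN.
bool fma_result_is_nan(double a, double b, double c) {
  return std::isnan(std::fma(a, b, c));
}
```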
Thu, Sep 19, 4:23 AM · Restricted Project

Wed, Sep 18

spatel created D67721: [InstSimplify] fold fma/fmuladd with a NaN operand.
Wed, Sep 18, 10:40 AM · Restricted Project
spatel committed rGe406a3f2d64c: [InstSimplify] add tests for fma/fmuladd; NFC (authored by spatel).
[InstSimplify] add tests for fma/fmuladd; NFC
Wed, Sep 18, 10:30 AM
spatel committed rL372236: [InstSimplify] add tests for fma/fmuladd; NFC.
[InstSimplify] add tests for fma/fmuladd; NFC
Wed, Sep 18, 10:25 AM
spatel committed rGd46bf63fbbad: [SimplifyLibCalls] fix crash with empty function name (PR43347) (authored by spatel).
[SimplifyLibCalls] fix crash with empty function name (PR43347)
Wed, Sep 18, 7:33 AM
spatel committed rL372227: [SimplifyLibCalls] fix crash with empty function name (PR43347).
[SimplifyLibCalls] fix crash with empty function name (PR43347)
Wed, Sep 18, 7:32 AM
spatel added inline comments to D67557: [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863).
Wed, Sep 18, 5:50 AM · Restricted Project

Tue, Sep 17

spatel added reviewers for D67564: [IR] allow fast-math-flags on phi of FP values: arsenm, jmolloy.

LGTM

Tue, Sep 17, 11:58 AM · Restricted Project
spatel added a reviewer for D67363: [BreakFalseDeps] ignore function with minsize attribute: greened.

Ping - ARM ok?

Tue, Sep 17, 7:51 AM · Restricted Project

Mon, Sep 16

spatel added a comment to D61675: [WIP] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator.

What targets does clang enable FTZ/DAZ on? I don't think it does on X86.

Mon, Sep 16, 11:49 AM · Restricted Project
spatel committed rG3961a143e13a: [InstCombine] remove unneeded one-use checks for icmp fold (authored by spatel).
[InstCombine] remove unneeded one-use checks for icmp fold
Mon, Sep 16, 9:15 AM
spatel committed rL372007: [InstCombine] remove unneeded one-use checks for icmp fold.
[InstCombine] remove unneeded one-use checks for icmp fold
Mon, Sep 16, 9:14 AM
spatel accepted D61675: [WIP] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator.

On 2nd thought, that doesn't really make sense for CSE. The real question is whether we should canonicalize the binary fneg to unary fneg in instcombine. And should that happen before/after/concurrent with this change to clang?

Tough question...

  1. The biggest problem I see with canonicalization of binary FNeg to unary FNeg is if DAZ/FTZ are set. With those set, a binary FNeg could be used to zero insignificant data. Otherwise, with a unary FNeg, the sign bit would just be flipped, regardless of whether it's a denormal or not. We have a weather code that uses an explicit binary FNeg to sanitize its noisy input, so any canonicalization there would disturb the results.

    Currently, if I'm not mistaken, Clang only enables DAZ/FTZ with -Ofast and -ffast-math, so no problem there. But, there's nothing stopping a user from compiling with -O0 and flipping the DAZ/FTZ bits themselves.

    Conversely, it would be easy to argue that setting DAZ/FTZ breaks IEEE-754 compliance, so it doesn't matter what we do from that point on. Although, this seems like a weird grey area to me.
Mon, Sep 16, 9:11 AM · Restricted Project
spatel committed rG4d9d0f9cf532: [InstCombine] move tests for icmp+add; NFC (authored by spatel).
[InstCombine] move tests for icmp+add; NFC
Mon, Sep 16, 8:36 AM
spatel committed rL372004: [InstCombine] move tests for icmp+add; NFC.
[InstCombine] move tests for icmp+add; NFC
Mon, Sep 16, 8:36 AM
spatel committed rGf201b1c91875: [InstCombine] add/move tests for icmp with add operand; NFC (authored by spatel).
[InstCombine] add/move tests for icmp with add operand; NFC
Mon, Sep 16, 7:07 AM
spatel committed rL371988: [InstCombine] add/move tests for icmp with add operand; NFC.
[InstCombine] add/move tests for icmp with add operand; NFC
Mon, Sep 16, 7:03 AM
spatel committed rGc5cd80815666: [InstCombine] remove unneeded one-use checks for icmp fold (authored by spatel).
[InstCombine] remove unneeded one-use checks for icmp fold
Mon, Sep 16, 5:55 AM
spatel committed rL371981: [InstCombine] remove unneeded one-use checks for icmp fold.
[InstCombine] remove unneeded one-use checks for icmp fold
Mon, Sep 16, 5:52 AM
spatel committed rG14ce3fde046a: [InstCombine] add icmp tests with extra uses; NFC (authored by spatel).
[InstCombine] add icmp tests with extra uses; NFC
Mon, Sep 16, 5:20 AM
spatel committed rL371979: [InstCombine] add icmp tests with extra uses; NFC.
[InstCombine] add icmp tests with extra uses; NFC
Mon, Sep 16, 5:20 AM
spatel committed rG91c2cd0691d1: [InstCombine] fix comments to match code; NFC (authored by spatel).
[InstCombine] fix comments to match code; NFC
Mon, Sep 16, 5:14 AM
spatel committed rL371978: [InstCombine] fix comments to match code; NFC.
[InstCombine] fix comments to match code; NFC
Mon, Sep 16, 5:14 AM

Sun, Sep 15

spatel committed rG3daf168fa986: [InstCombine] remove unneeded one-use checks for icmp fold (authored by spatel).
[InstCombine] remove unneeded one-use checks for icmp fold
Sun, Sep 15, 1:57 PM
spatel committed rL371940: [InstCombine] remove unneeded one-use checks for icmp fold.
[InstCombine] remove unneeded one-use checks for icmp fold
Sun, Sep 15, 1:57 PM
spatel committed rGc77ad16f8e5f: [InstCombine] add icmp tests with extra uses; NFC (authored by spatel).
[InstCombine] add icmp tests with extra uses; NFC
Sun, Sep 15, 1:13 PM
spatel committed rL371939: [InstCombine] add icmp tests with extra uses; NFC.
[InstCombine] add icmp tests with extra uses; NFC
Sun, Sep 15, 1:12 PM
spatel added a comment to D61675: [WIP] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator.

Ping.

I've been digging through this pass and it seems to be ok AFAICT. OptimizeInst(...) canonicalizes both unary and binary FNegs to -1.0*X, if they are fast and part of a special multiply tree. Other FNegs end up as leaf nodes, so no problem there.

Anyone aware of other situations I should look at?

Do we need to enhance EarlyCSE to see this equivalence:

define float @cse_fneg(float %x, i1 %cond) {
  %fneg_unary = fneg float %x
  %fneg_binary = fsub float -0.0, %x
  %r = select i1 %cond, float %fneg_unary, float %fneg_binary
  ret float %r
}

The binary fneg has the looser requirement for NaN propagation (IEEE-754 6.3: "this standard does not specify the sign bit of a NaN result [for math ops]"), but that's less important for optimization than knowing the 2 values are otherwise equivalent.
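
The same equivalence can be checked at the source level. The sketch below assumes IEEE-754 floats, the default round-to-nearest mode, and no FTZ/DAZ; the helper names are made up. It compares bit patterns so that signed zeros are distinguished.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration.
uint32_t bits_of(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof u);  // inspect the raw IEEE-754 encoding
  return u;
}

float fneg_unary(float x) { return -x; }          // flips the sign bit
float fneg_binary(float x) { return -0.0f - x; }  // subtraction from -0.0

// True when both forms produce the identical bit pattern for x.
bool fneg_forms_agree(float x) {
  return bits_of(fneg_unary(x)) == bits_of(fneg_binary(x));
}
```

NaN inputs are deliberately left out of the check, since the sign bit of a NaN result from the subtraction is not specified.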

Sun, Sep 15, 7:22 AM · Restricted Project
spatel added a comment to D61675: [WIP] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator.

Ping.

I've been digging through this pass and it seems to be ok AFAICT. OptimizeInst(...) canonicalizes both unary and binary FNegs to -1.0*X, if they are fast and part of a special multiply tree. Other FNegs end up as leaf nodes, so no problem there.

Anyone aware of other situations I should look at?

Sun, Sep 15, 7:15 AM · Restricted Project
spatel added a comment to D67564: [IR] allow fast-math-flags on phi of FP values.

WRT divergent phi FMF, if you wanted to be pessimistic you could do an intersection (and); doing an inclusion (or) might include too much. Either way behavior would change. The intersection already has other prior context in FMF.
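
To make the two options concrete, here is a minimal sketch that models flags as a bitmask. The enum values are hypothetical and do not reflect LLVM's actual FastMathFlags encoding or API.

```cpp
#include <cassert>

// Hypothetical flag bits, for illustration only.
enum FlagBit : unsigned {
  NoNaNs = 1u << 0,
  NoInfs = 1u << 1,
  NoSignedZeros = 1u << 2,
};

// Pessimistic merge: keep only the flags that every incoming value has.
unsigned intersect_flags(unsigned a, unsigned b) { return a & b; }

// Optimistic merge: keep any flag that some incoming value has, which can
// claim a guarantee that one of the inputs never made.
unsigned union_flags(unsigned a, unsigned b) { return a | b; }
```

With incoming sets {NoNaNs, NoInfs} and {NoNaNs, NoSignedZeros}, the intersection keeps just NoNaNs, while the union would assert all three.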

Sun, Sep 15, 6:26 AM · Restricted Project
spatel committed rGb6a0faaa0c79: [SLP] limit vectorization of Constant subclasses (PR33958) (authored by spatel).
[SLP] limit vectorization of Constant subclasses (PR33958)
Sun, Sep 15, 6:07 AM
spatel committed rL371931: [SLP] limit vectorization of Constant subclasses (PR33958).
[SLP] limit vectorization of Constant subclasses (PR33958)
Sun, Sep 15, 6:07 AM
spatel closed D67362: [SLP] limit vectorization of Constant subclasses (PR33958).
Sun, Sep 15, 6:07 AM · Restricted Project

Fri, Sep 13

spatel accepted D67498: [InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases (PR43251).

LGTM

Fri, Sep 13, 2:03 PM · Restricted Project
spatel updated the diff for D67362: [SLP] limit vectorization of Constant subclasses (PR33958).

Patch updated:
Rebased with test for constant expressions.

Fri, Sep 13, 12:03 PM · Restricted Project
spatel added a comment to D67362: [SLP] limit vectorization of Constant subclasses (PR33958).

Can we easily get a constexpr test?

Fri, Sep 13, 12:02 PM · Restricted Project
spatel committed rG4ba6717c7e56: [SLP] add test for vectorization of constant expressions; NFC (authored by spatel).
[SLP] add test for vectorization of constant expressions; NFC
Fri, Sep 13, 11:34 AM
spatel committed rL371879: [SLP] add test for vectorization of constant expressions; NFC.
[SLP] add test for vectorization of constant expressions; NFC
Fri, Sep 13, 11:33 AM
spatel created D67564: [IR] allow fast-math-flags on phi of FP values.
Fri, Sep 13, 10:40 AM · Restricted Project

Thu, Sep 12

spatel added inline comments to D67498: [InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases (PR43251).
Thu, Sep 12, 12:22 PM · Restricted Project
spatel accepted D67412: [InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251).

LGTM

Thu, Sep 12, 11:54 AM · Restricted Project
spatel committed rG458c2759b184: [InstCombine] add tests for fptrunc; NFC (authored by spatel).
[InstCombine] add tests for fptrunc; NFC
Thu, Sep 12, 11:00 AM
spatel committed rL371750: [InstCombine] add tests for fptrunc; NFC.
[InstCombine] add tests for fptrunc; NFC
Thu, Sep 12, 10:58 AM
spatel committed rG62ad62fb98ec: [InstCombine] reduce test noise and regenerate CHECK lines; NFC (authored by spatel).
[InstCombine] reduce test noise and regenerate CHECK lines; NFC
Thu, Sep 12, 10:11 AM
spatel committed rL371746: [InstCombine] reduce test noise and regenerate CHECK lines; NFC.
[InstCombine] reduce test noise and regenerate CHECK lines; NFC
Thu, Sep 12, 10:11 AM
spatel added inline comments to D67502: [InstSimplify] simplifyUnsignedRangeCheck(): '(a+b) </>= c &&/|| (a+b) ==/!= 0' if we known 'c' is 'a' or 'b' and is non-zero (PR43259).
Thu, Sep 12, 10:02 AM · Restricted Project
spatel added inline comments to D67502: [InstSimplify] simplifyUnsignedRangeCheck(): '(a+b) </>= c &&/|| (a+b) ==/!= 0' if we known 'c' is 'a' or 'b' and is non-zero (PR43259).
Thu, Sep 12, 8:33 AM · Restricted Project
spatel committed rG3f5a80836503: [ConstProp] allow folding for fma that produces NaN (authored by spatel).
[ConstProp] allow folding for fma that produces NaN
Thu, Sep 12, 7:14 AM
spatel committed rL371735: [ConstProp] allow folding for fma that produces NaN.
[ConstProp] allow folding for fma that produces NaN
Thu, Sep 12, 7:13 AM
spatel closed D67446: [ConstProp] allow folding for fma that produces NaN.
Thu, Sep 12, 7:13 AM · Restricted Project
spatel added inline comments to D67446: [ConstProp] allow folding for fma that produces NaN.
Thu, Sep 12, 6:12 AM · Restricted Project

Wed, Sep 11

spatel committed rG2bfb955c51fd: [InstCombine] rename variable for readability; NFC (authored by spatel).
[InstCombine] rename variable for readability; NFC
Wed, Sep 11, 3:34 PM
spatel committed rL371682: [InstCombine] rename variable for readability; NFC.
[InstCombine] rename variable for readability; NFC
Wed, Sep 11, 3:30 PM
spatel accepted D67259: [X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs..

LGTM

Wed, Sep 11, 12:58 PM · Restricted Project
spatel accepted D67411: [InstSimplify] simplifyUnsignedRangeCheck(): handle more cases (PR43251).

LGTM

Wed, Sep 11, 12:54 PM · Restricted Project
spatel updated the diff for D67446: [ConstProp] allow folding for fma that produces NaN.

Patch updated:
Added clarifying comment to APFloat header.

Wed, Sep 11, 10:17 AM · Restricted Project
spatel added inline comments to D67446: [ConstProp] allow folding for fma that produces NaN.
Wed, Sep 11, 10:05 AM · Restricted Project
spatel added inline comments to D67446: [ConstProp] allow folding for fma that produces NaN.
Wed, Sep 11, 9:52 AM · Restricted Project
spatel added inline comments to D67411: [InstSimplify] simplifyUnsignedRangeCheck(): handle more cases (PR43251).
Wed, Sep 11, 8:25 AM · Restricted Project
spatel added inline comments to D67411: [InstSimplify] simplifyUnsignedRangeCheck(): handle more cases (PR43251).
Wed, Sep 11, 8:11 AM · Restricted Project
spatel added a reviewer for D67446: [ConstProp] allow folding for fma that produces NaN: cameron.mcinally.
Wed, Sep 11, 7:32 AM · Restricted Project
spatel created D67446: [ConstProp] allow folding for fma that produces NaN.
Wed, Sep 11, 7:31 AM · Restricted Project
spatel committed rGede0905c1fb2: [ConstProp] add tests for fma that produce NaN; NFC (authored by spatel).
[ConstProp] add tests for fma that produce NaN; NFC
Wed, Sep 11, 7:20 AM
spatel committed rL371621: [ConstProp] add tests for fma that produce NaN; NFC.
[ConstProp] add tests for fma that produce NaN; NFC
Wed, Sep 11, 7:20 AM
spatel committed rG9c4047f26724: [ConstProp] move test file from InstSimplify; NFC (authored by spatel).
[ConstProp] move test file from InstSimplify; NFC
Wed, Sep 11, 7:02 AM
spatel committed rL371619: [ConstProp] move test file from InstSimplify; NFC.
[ConstProp] move test file from InstSimplify; NFC
Wed, Sep 11, 7:02 AM
spatel committed rG29ba5e0817ab: [InstSimplify] regenerate test CHECKs; NFC (authored by spatel).
[InstSimplify] regenerate test CHECKs; NFC
Wed, Sep 11, 6:57 AM
spatel committed rL371617: [InstSimplify] regenerate test CHECKs; NFC.
[InstSimplify] regenerate test CHECKs; NFC
Wed, Sep 11, 6:54 AM
spatel committed rG3183466aa604: [LangRef] add link for fma intrinsic (authored by spatel).
[LangRef] add link for fma intrinsic
Wed, Sep 11, 6:25 AM
spatel committed rL371615: [LangRef] add link for fma intrinsic.
[LangRef] add link for fma intrinsic
Wed, Sep 11, 6:24 AM
spatel accepted D66050: Improve division estimation of floating points..

LGTM

Wed, Sep 11, 5:38 AM · Restricted Project
spatel committed rGb3b2064c5180: [LangRef] fix punctuation; NFC (authored by spatel).
[LangRef] fix punctuation; NFC
Wed, Sep 11, 5:22 AM
spatel committed rL371612: [LangRef] fix punctuation; NFC.
[LangRef] fix punctuation; NFC
Wed, Sep 11, 5:21 AM
spatel committed rG80bea345d119: [InstCombine] fold sign-bit compares of srem (authored by spatel).
[InstCombine] fold sign-bit compares of srem
Wed, Sep 11, 5:04 AM
spatel committed rL371610: [InstCombine] fold sign-bit compares of srem.
[InstCombine] fold sign-bit compares of srem
Wed, Sep 11, 5:03 AM