Page MenuHomePhabricator

lebedev.ri (Roman Lebedev)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 27 2012, 6:35 AM (412 w, 6 d)

Recent Activity

Yesterday

lebedev.ri added a comment to D86233: [LangRef] Define mustprogress attribute.

No further comments from me, but i strongly like inline comment reword.

Fri, Sep 25, 9:51 AM · Restricted Project
lebedev.ri added a comment to D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.

Does that sound reasonable?

Yes IMHO.

What are the next suggested steps?

It would be great to isolate and check the cases which regressed a bit.

Fri, Sep 25, 4:05 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.

@MaskRay, @dmgreen & @sanwou01 thank you for running perf experiment!

Fri, Sep 25, 2:08 AM · Restricted Project, Restricted Project
lebedev.ri requested review of D88283: [NFCI][IR] ConstantRangeTest: add basic scaffolding for next-gen precision/correctness testing.
Fri, Sep 25, 1:29 AM · Restricted Project

Thu, Sep 24

lebedev.ri added a comment to D88178: [ConstantRange] Make binaryNot() more precise.

Guess i'll just have to do it myself then.

I feel like I'm missing context here. Did I overlook review feedback somewhere?

Thu, Sep 24, 11:18 PM · Restricted Project
lebedev.ri committed rG9bcf7b1c7a13: [NFCI][IR] ConstantRangeTest: add basic scaffolding for next-gen… (authored by lebedev.ri).
[NFCI][IR] ConstantRangeTest: add basic scaffolding for next-gen…
Thu, Sep 24, 2:37 PM
lebedev.ri committed rG31177949cb1d: [NFCI][IR] ConstantRangeTest: refactor operation range gatherers (authored by lebedev.ri).
[NFCI][IR] ConstantRangeTest: refactor operation range gatherers
Thu, Sep 24, 2:37 PM
lebedev.ri added a comment to D88178: [ConstantRange] Make binaryNot() more precise.

Rebase please

Thu, Sep 24, 2:37 PM · Restricted Project
lebedev.ri added a comment to D88178: [ConstantRange] Make binaryNot() more precise.

Guess i'll just have to do it myself then.

Thu, Sep 24, 2:35 PM · Restricted Project
lebedev.ri accepted D84176: [CMake][CTE] Add "check-clang-extra-..." targets to test only a particular Clang extra tool.

LGTM, please feel free to land this as-is.

Thu, Sep 24, 10:04 AM · Restricted Project, Restricted Project
lebedev.ri accepted D88225: [DAG] Fold vector mul(x,0)/mul(x,1) to a clearing mask.

Not sure about AArch64, but this LG to me.

Thu, Sep 24, 8:09 AM · Restricted Project
lebedev.ri added a comment to D36836: [clang-tidy] Implement sonarsource-function-cognitive-complexity check.

Rebased.

There is a number of official open-source LGPL-3 implementations already:

There are other open-source LGPL-3 implementations already:

There are other 3rd party implementations:

Quite honestly, i do not understand how did the license question arose.

It arose in a comment that I can't seem to get phab to show me the context for (which is a bit strange, I don't think I've run into that before): https://reviews.llvm.org/D36836#877636 Perhaps part of this was carrying discussion over from the IRC channel?

Would have it been fine if i based this on the open-source-licensed code?

I believe that would require legal analysis to answer.

Would have it not been? Would same license question be raised?

Likewise here (I suspect the answer would depend on what the license of the open source code is).

Somehow i don't think it would have been.

I don't wish to speculate about legal licensing issues on the mailing lists.

Is this really just about Copyright SonarSource S.A., 2018, Switzerland. All content is copyright protected. in https://www.sonarsource.com/docs/CognitiveComplexity.pdf ?
But that is only about the document, not the algorithm.
But even if we enternain the idea that all of the implementations must bow to that license,
then surely this is not the first code in LLVM that is implicitly/explicitly based on copyrighted doc.

This is rather frustrating.

I am sorry and I agree that it's frustrating.

Thu, Sep 24, 7:56 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D87832: [IndVars] Remove monotonic checks with unknown exit count.

This looks reasonable to me, but i'd prefer to defer to other reviewers.

Thu, Sep 24, 3:26 AM · Restricted Project

Wed, Sep 23

lebedev.ri updated the diff for D36836: [clang-tidy] Implement sonarsource-function-cognitive-complexity check.

There is a number of official open-source LGPL-3 implementations already:

Wed, Sep 23, 3:12 AM · Restricted Project, Restricted Project

Tue, Sep 22

lebedev.ri committed rGb289dc530632: [CVP] Narrow SDiv/SRem to the smallest power-of-2 that's sufficient to contain… (authored by lebedev.ri).
[CVP] Narrow SDiv/SRem to the smallest power-of-2 that's sufficient to contain…
Tue, Sep 22, 11:39 AM
lebedev.ri committed rGcb10d5d714e9: [NFC][CVP] Add tests for SDiv/SRem narrowing (authored by lebedev.ri).
[NFC][CVP] Add tests for SDiv/SRem narrowing
Tue, Sep 22, 11:38 AM
lebedev.ri committed rG4977eadee56f: [NFC][CVP] Give a better name STATISTIC() counting udiv i16 -> udiv i8 xforms (authored by lebedev.ri).
[NFC][CVP] Give a better name STATISTIC() counting udiv i16 -> udiv i8 xforms
Tue, Sep 22, 11:38 AM
lebedev.ri committed rG7465da2077c2: [ConstantRange] Introduce getMinSignedBits() method (authored by lebedev.ri).
[ConstantRange] Introduce getMinSignedBits() method
Tue, Sep 22, 11:38 AM
lebedev.ri committed rGb85395f30989: [NFC][APInt] Refactor getMinSignedBits() in terms of getNumSignBits() (authored by lebedev.ri).
[NFC][APInt] Refactor getMinSignedBits() in terms of getNumSignBits()
Tue, Sep 22, 11:38 AM
lebedev.ri committed rGba5afe5588de: [NFC][CVP] processUDivOrURem(): refactor to use ConstantRange::getActiveBits() (authored by lebedev.ri).
[NFC][CVP] processUDivOrURem(): refactor to use ConstantRange::getActiveBits()
Tue, Sep 22, 11:38 AM
lebedev.ri committed rG2ed9c4c70bbb: [ConstantRange] Introduce getActiveBits() method (authored by lebedev.ri).
[ConstantRange] Introduce getActiveBits() method
Tue, Sep 22, 11:38 AM
lebedev.ri committed rGb38d897e8026: [ConstantRange] binaryXor(): special-case binary complement case - the result… (authored by lebedev.ri).
[ConstantRange] binaryXor(): special-case binary complement case - the result…
Tue, Sep 22, 11:38 AM
lebedev.ri committed rG4eeeb356fc41: [CVP] Enhance SRem -> URem fold to work not just on non-negative operands (authored by lebedev.ri).
[CVP] Enhance SRem -> URem fold to work not just on non-negative operands
Tue, Sep 22, 11:38 AM
lebedev.ri committed rG36ea18b06430: [NFC][CVP] Add tests for srem with potentially different sigdness domains (authored by lebedev.ri).
[NFC][CVP] Add tests for srem with potentially different sigdness domains
Tue, Sep 22, 11:38 AM

Mon, Sep 21

lebedev.ri committed rG0ab99bb31420: [NFC][SCEV] Cleanup lowering of @llvm.uadd.sat, (-1 - V) is just ~V (authored by lebedev.ri).
[NFC][SCEV] Cleanup lowering of @llvm.uadd.sat, (-1 - V) is just ~V
Mon, Sep 21, 12:11 PM
lebedev.ri committed rG64e2cb7e9605: [SCEV] Recognize @llvm.uadd.sat as `%y + umin(%x, (-1 - %y))` (authored by lebedev.ri).
[SCEV] Recognize @llvm.uadd.sat as `%y + umin(%x, (-1 - %y))`
Mon, Sep 21, 10:27 AM
lebedev.ri committed rGfedc9549d50d: [SCEV] Recognize @llvm.usub.sat as `%x - (umin %x, %y)` (authored by lebedev.ri).
[SCEV] Recognize @llvm.usub.sat as `%x - (umin %x, %y)`
Mon, Sep 21, 10:27 AM
lebedev.ri committed rG0592de550f5c: [NFC][SCEV] Add tests for @llvm.*.sat intrinsics (authored by lebedev.ri).
[NFC][SCEV] Add tests for @llvm.*.sat intrinsics
Mon, Sep 21, 10:27 AM
lebedev.ri committed rG1bb7ab8c4a32: [SCEV] Recognize @llvm.abs as smax(x, -x) (authored by lebedev.ri).
[SCEV] Recognize @llvm.abs as smax(x, -x)
Mon, Sep 21, 10:26 AM
lebedev.ri committed rG83c2d10d3cae: [NFC][SCEV] Add tests for @llvm.abs intrinsic (authored by lebedev.ri).
[NFC][SCEV] Add tests for @llvm.abs intrinsic
Mon, Sep 21, 10:26 AM
lebedev.ri accepted D88015: [SCEV] Support unsigned predicates in isKnownPredicateViaNoOverflow.

What about swapped cases, like (X + C)<nuw> u<= X?

Mon, Sep 21, 6:41 AM · Restricted Project
lebedev.ri added a comment to D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.

I assume this makes 1f4e7463b5e3ff654c84371527767830e51db10d redundant?

Mon, Sep 21, 5:55 AM · Restricted Project, Restricted Project
lebedev.ri resigned from D71199: [clang-tidy] New check cppcoreguidelines-prefer-member-initializer.

Prerequeisite patch is committed, the check is tested now on the LLVM Project. @lebedev.ri, @aaron.ballman can I recommit it?

Thank you! SGTM.

Mon, Sep 21, 4:33 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D88015: [SCEV] Support unsigned predicates in isKnownPredicateViaNoOverflow.

What about swapped cases, like (X + C)<nuw> u<= X?

Mon, Sep 21, 4:32 AM · Restricted Project
lebedev.ri added a comment to D68414: [SROA] Enhance AggLoadStoreRewriter to rewrite integer load/store if it covers multi fields in original aggregate.

compile-time results: https://llvm-compile-time-tracker.com/compare.php?from=e616a4259889b55ed1bf5bf095f0e59658c6e311&to=0a2a92f815130c9ed2f0fed11850079bbd55038e&stat=instructions

Mon, Sep 21, 2:52 AM · Restricted Project
lebedev.ri added a comment to D87344: [IndVars] Remove exiting conditions that are trivially true/false.

This looks about reasonable to me, but i don't feel confident reviewing this.

Mon, Sep 21, 2:43 AM · Restricted Project

Sun, Sep 20

lebedev.ri added a comment to D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.

I have tested this patch internally and seen gains and losses. On one document search related benchmark 3~5% improvement. One zippy (snappy) there is 3~5% regression. Perhaps we do need a conditional extra SROA run.

Snappy - you mean public https://github.com/google/snappy?

Well, it should be possible to analyze it...

Sun, Sep 20, 10:54 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.

(I'm guessing that we are talking about run-time performance here.)

Sun, Sep 20, 9:57 AM · Restricted Project, Restricted Project
lebedev.ri accepted D87965: [InstCombine] replace phi values from unreachable blocks with 'undef'.

lg

Sun, Sep 20, 6:02 AM · Restricted Project
lebedev.ri updated the summary of D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.
Sun, Sep 20, 4:01 AM · Restricted Project, Restricted Project
lebedev.ri updated the summary of D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.
Sun, Sep 20, 3:56 AM · Restricted Project, Restricted Project

Sat, Sep 19

lebedev.ri updated the diff for D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.

Fixing a few clang tests and updating one more llvm test to check this also.

Sat, Sep 19, 12:37 PM · Restricted Project, Restricted Project
lebedev.ri added a comment to D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.
Sat, Sep 19, 11:59 AM · Restricted Project, Restricted Project
lebedev.ri updated the summary of D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.
Sat, Sep 19, 11:58 AM · Restricted Project, Restricted Project
lebedev.ri updated the summary of D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.
Sat, Sep 19, 11:54 AM · Restricted Project, Restricted Project
lebedev.ri added reviewers for D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling: RKSimon, craig.topper.
Sat, Sep 19, 11:53 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.

Looks like it does, yes.

Sat, Sep 19, 11:52 AM · Restricted Project, Restricted Project
lebedev.ri requested review of D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling.
Sat, Sep 19, 11:25 AM · Restricted Project, Restricted Project
lebedev.ri committed rGbb6f4d32aac3: [NFC][PhaseOrdering] Add test showing SROA not being performed after loop… (authored by lebedev.ri).
[NFC][PhaseOrdering] Add test showing SROA not being performed after loop…
Sat, Sep 19, 11:19 AM

Fri, Sep 18

lebedev.ri added inline comments to D87340: [EarlyCSE] Handle masked loads and stores.
Fri, Sep 18, 12:16 PM · Restricted Project
lebedev.ri added a comment to D87149: [InstCombine] erase instructions leading up to unreachable.

Hi!

We've run into a case where instcombine crashes with this patch:

opt -o /dev/null bbi-47401.ll -instcombine

on

@c = external global i16*, align 1

define void @main() {
entry:
  br label %for.cond

for.cond:                                         ; preds = %g.exit, %entry
  %conv1 = phi double [ %conv, %g.exit ], [ undef, %entry ]
  br i1 undef, label %for.end, label %for.body

for.body:                                         ; preds = %for.cond
  %0 = load i16*, i16** @c, align 1
  %1 = load i16, i16* %0, align 1
  %conv = sitofp i16 %1 to double
  unreachable

g.exit:                                           ; No predecessors!
  br label %for.cond

for.end:                                          ; preds = %for.cond
  %conv1.lcssa = phi double [ %conv1, %for.cond ]
  store double %conv1.lcssa, double* undef, align 1
  ret void
}

results in

opt: ../lib/Transforms/InstCombine/InstCombineInternal.h:448: virtual llvm::Instruction *llvm::InstCombinerImpl::eraseInstFromFunction(llvm::Instruction &): Assertion `I.use_empty() && "Cannot erase instruction that is used!"' failed.

I suppose it's the dead block %g.exit that messes up things.

We had a use_empty() guard in an earlier draft of this patch and decided that could not happen. :)
https://reviews.llvm.org/D87149#2259207
Thanks for the test case!

Fri, Sep 18, 10:07 AM · Restricted Project
lebedev.ri added inline comments to D85626: [tools][remarks-shlib] Don't build libRemarks.so without PIC.
Fri, Sep 18, 7:14 AM · Restricted Project
lebedev.ri added a comment to D87841: [openmp][libomptarget] Include header from LLVM source tree.

As openmp and llvm are now developed in the same monorepo, the only uses this
dependency can break are those that download the monorepo, then delete the llvm
directory before calling cmake. That is not a compelling use case.

That is factually not accurate.
There are standalone tarballs, including one for omp:
https://releases.llvm.org/download.html#10.0.1 OpenMP Source code (.sig)

Right. You can download the source for the sub-projects.
Though, others, e.g., clang, are already not compileable in isolation.

I see three options here:

  1. Continue to copy code from llvm-core to openmp and back. This is "good" for people that compile libomp in isolation and "bad" for developers. Arguably, it is also "bad" for people that use libomp with clang because there is an increased risk we have an unnoticed mismatch between the two. This is not theoretical but already happened.
  2. Copy the code from llvm-core to openmp during packaging time. This is a (small?) burden on the package maintainers but has all the advantages for now. However, it will get more complicated the more llvm-core features we want to use, e.g., tablegen.
  3. Require users to download llvm-core and point to it. This is a small burden on the users but has all the advantages now and in the future.

This patch, on its own, picks 3). I'm happy with that.

  1. At CMake time, check whether llvm-core is there; if yes use OMPGridValues.h from llvm-core, if no then download the header from some suitable location automatically. This eliminates the need for (2) and makes both proponents of (1) and proponents of (3) happy.
Fri, Sep 18, 4:25 AM · Restricted Project, Restricted Project

Thu, Sep 17

lebedev.ri added a comment to D87841: [openmp][libomptarget] Include header from LLVM source tree.

As openmp and llvm are now developed in the same monorepo, the only uses this
dependency can break are those that download the monorepo, then delete the llvm
directory before calling cmake. That is not a compelling use case.

Thu, Sep 17, 10:28 AM · Restricted Project, Restricted Project
lebedev.ri added inline comments to D68488: [PATCH 03/27] [noalias] [IR] Introduce ptr_provenance for LoadInst/StoreInst.
Thu, Sep 17, 8:48 AM · Restricted Project
lebedev.ri accepted D87825: [ASTMatchers] Define clang::ast_matchers::decompositionDecl.

SGTM, that's a pretty traditional issue.

Thu, Sep 17, 4:23 AM · Restricted Project
lebedev.ri committed rGaadf55d1cea2: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32… (authored by lebedev.ri).
[NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32…
Thu, Sep 17, 1:29 AM
lebedev.ri closed D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).
Thu, Sep 17, 1:29 AM · Restricted Project
lebedev.ri added a comment to D68488: [PATCH 03/27] [noalias] [IR] Introduce ptr_provenance for LoadInst/StoreInst.

I have 3 variants of this patch:

  • variant 0 : always reserve place for the ptr_provenance; Introduce NumOfOperandsDelta to get the correct number of operands
    • extra space overhead for load/store instructions;
    • extra instruction overhead for ALL instructions ('getNumOperand()');
    • low overhead when adding/removing ptr_provenance
    • NOTE: this is the current provided patch
  • variant 1 : always reserve place for the ptr_provenance; Move arguments around when adding/removing ptr_provenance
    • extra space overhead for load/store instructions;
    • NO impact on other instructions;
    • slightly more overhead when accessing operands of load/store instructions;
    • medium overhead when adding/removing ptr_provenance
  • variant 2 : Have a LoadInst, LoadWithProvInst and StoreInst, StoreWithProvInst. Only the 'WithProv' variant reserves place for optional argument. When the ptr_provenance is added, it can involve creating a Load/StoreWithProv based on the original Load/Store. Adding/removing the ptr_provenance argument involves moving arguments around.
    • extra space only needed when we know we will need it.
    • NO impact on other instructions;
    • slightly more overhead when accessing operands of load/store instructions;
    • medium to high overhead when adding ptr_provenance. medium when removing it.
    • higher risk in keeping things correct in optimization: adding the ptr_provenance can require recreating the Load/Store instruction. This might be difficult if the optimization pass is tracking pointers to the original Load/Store instruction.
    • main benefit: almost no impact from this change if the feature is not used (aka -fno-full-restrict)

Any preference in the way forward ? variant 1 is the easy one, as it is just a drop-in. Variant 2 is at the moment less-stable as it requires changes in all optimization passes that need to track the ptr_provenance.

Thu, Sep 17, 1:13 AM · Restricted Project
lebedev.ri added a comment to D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).

LGTM

Thu, Sep 17, 1:07 AM · Restricted Project

Tue, Sep 15

lebedev.ri added inline comments to D56387: [DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE (WIP).
Tue, Sep 15, 10:52 AM · Restricted Project
lebedev.ri added a comment to D86395: [InstCombine] transform pattern "(~A & B) ^ A -> (A | B)" added.

test53 and test55 has been removed.

We are going in circles. I'll give this 1 more try, and if it still doesn't make sense, then maybe another reviewer can better communicate what we expect for the tests:

  1. There are 4 commutative patterns variations in the code.
  2. Each one of those patterns should have a test (we should have at least 4 tests).
  3. Each test corresponds to exactly one of the commutative patterns.
  4. Each test should be in canonical form without this patch (the baseline CHECK lines should correspond exactly to the IR as written).
  5. The 8 step check I provided earlier may be used to confirm that the tests provide the expected coverage for the code.

@spatel, apologies for the confusion created due to multiple revisions.

Let's consider the initial pattern and associated commutative version of this.

initial pattern:-
(~A & B) ^ A -> (A | B) 

And 8 possible commutative version of this.
1) (~A & B) ^ A -> (A | B) 
2) (~B & A) ^ B -> (A | B)
3) (B & ~A) ^ A -> (A | B)  --> identical to 1)
4) A ^ (~A & B) -> (A | B)  --> identical to 1)
5) A ^ (B & ~A) -> (A | B)  --> identical to 1)
6) (A & ~B) ^ B -> (A | B)  --> identical to 2)
7) B ^ (~B & A) -> (A | B)  --> identical to 2)
8) B ^ (A & ~B) -> (A | B)  --> identical to 2)

As you may notice that all these patterns (IR) are reducing to 2 unique patterns (pattern 1 and pattern 2).
So pattern 1 and 2 are sufficient for the coverage of all commutative variations of the primary pattern.

How so?
If you only have a test for pattern 1 and pattern 2, how do you know that the commutative variants are handled?

Does this matches with you understandings ??

@lebedev.ri , I have manually tested and verified.

Tue, Sep 15, 10:22 AM · Restricted Project
lebedev.ri added a comment to D86395: [InstCombine] transform pattern "(~A & B) ^ A -> (A | B)" added.

test53 and test55 has been removed.

We are going in circles. I'll give this 1 more try, and if it still doesn't make sense, then maybe another reviewer can better communicate what we expect for the tests:

  1. There are 4 commutative patterns variations in the code.
  2. Each one of those patterns should have a test (we should have at least 4 tests).
  3. Each test corresponds to exactly one of the commutative patterns.
  4. Each test should be in canonical form without this patch (the baseline CHECK lines should correspond exactly to the IR as written).
  5. The 8 step check I provided earlier may be used to confirm that the tests provide the expected coverage for the code.

@spatel, apologies for the confusion created due to multiple revisions.

Let's consider the initial pattern and associated commutative version of this.

initial pattern:-
(~A & B) ^ A -> (A | B) 

And 8 possible commutative version of this.
1) (~A & B) ^ A -> (A | B) 
2) (~B & A) ^ B -> (A | B)
3) (B & ~A) ^ A -> (A | B)  --> identical to 1)
4) A ^ (~A & B) -> (A | B)  --> identical to 1)
5) A ^ (B & ~A) -> (A | B)  --> identical to 1)
6) (A & ~B) ^ B -> (A | B)  --> identical to 2)
7) B ^ (~B & A) -> (A | B)  --> identical to 2)
8) B ^ (A & ~B) -> (A | B)  --> identical to 2)

As you may notice that all these patterns (IR) are reducing to 2 unique patterns (pattern 1 and pattern 2).
So pattern 1 and 2 are sufficient for the coverage of all commutative variations of the primary pattern.

How so?
If you only have a test for pattern 1 and pattern 2, how do you know that the commutative variants are handled?

Tue, Sep 15, 9:17 AM · Restricted Project
lebedev.ri accepted D87685: [Scalarizer] Avoid changing name of non-instructions.

SGTM, sorry about that.

Tue, Sep 15, 4:56 AM · Restricted Project

Sun, Sep 13

lebedev.ri added a comment to D86890: [compiler-rt] Remove copy of InstrProfData.inc.

+1

Sun, Sep 13, 2:14 PM · Restricted Project, Restricted Project
lebedev.ri updated the diff for D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).

The point of the cutoff wouldn't be to handle anything in the LLVM testsuite; it would be to prevent it from blowing up on someone's giant machine-generated function which isn't represented in the testuite.

Sun, Sep 13, 2:11 PM · Restricted Project
lebedev.ri added a comment to D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).
Sun, Sep 13, 2:10 PM · Restricted Project
lebedev.ri retitled D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%) from [NFC] EliminateDuplicatePHINodes(): drop DenseMap-driven CSE in favor of quadratic algorithmn to [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).
Sun, Sep 13, 2:09 PM · Restricted Project
lebedev.ri added inline comments to D87555: [DivRemPairs] Add an initial case for hoisting to a common predecessor..
Sun, Sep 13, 1:16 AM · Restricted Project

Sat, Sep 12

lebedev.ri added a comment to D87555: [DivRemPairs] Add an initial case for hoisting to a common predecessor..

Can we use DominatorTree::findNearestCommonDominator()?
If not, please add a negative test, where it's unsafe to do so.

The nearest common dominator could put the division on a path that it wasn't going to execute on right? Isn't that not just possibly unsafe but also possibly unprofitable?

Sat, Sep 12, 12:42 AM · Restricted Project

Fri, Sep 11

lebedev.ri added a comment to D87555: [DivRemPairs] Add an initial case for hoisting to a common predecessor..

Can we use DominatorTree::findNearestCommonDominator()?
If not, please add a negative test, where it's unsafe to do so.

Fri, Sep 11, 11:40 PM · Restricted Project
lebedev.ri added a comment to D87479: [InstCombine] Don't sink the fdiv from (fmul (fdiv 1.0, %x), %y) if the fdiv isn't in the same basic block as the fmul.

I personally really think this should be viewed as phase ordering issue, that should be addressed by adding/moving passes in pipeline.

Fri, Sep 11, 10:54 AM · Restricted Project
lebedev.ri added a comment to D87527: [ASTMatchers] Fix `hasBody` for the descendants of `FunctionDecl`.

Thank you for looking into it.

Fri, Sep 11, 10:52 AM · Restricted Project
lebedev.ri added a comment to D71199: [clang-tidy] New check cppcoreguidelines-prefer-member-initializer.

Thank you for looking into it.

Fri, Sep 11, 10:52 AM · Restricted Project, Restricted Project

Thu, Sep 10

lebedev.ri added a comment to D87479: [InstCombine] Don't sink the fdiv from (fmul (fdiv 1.0, %x), %y) if the fdiv isn't in the same basic block as the fmul.

When does that happen?
-O3 seems to be fine: https://godbolt.org/z/1K3r56
Do we have InstCombine invocations after last LICM?

We found it in our internal code base so maybe there’s something different about our pipeline.

Thu, Sep 10, 2:09 PM · Restricted Project
lebedev.ri added a comment to D87479: [InstCombine] Don't sink the fdiv from (fmul (fdiv 1.0, %x), %y) if the fdiv isn't in the same basic block as the fmul.

When does that happen?
-O3 seems to be fine: https://godbolt.org/z/1K3r56
Do we have InstCombine invocations after last LICM?

Thu, Sep 10, 12:57 PM · Restricted Project
lebedev.ri accepted D87435: [PGO] Skip if an IndirectBrInst critical edge cannot be split.

SGTM, thanks

Thu, Sep 10, 9:50 AM · Restricted Project
lebedev.ri added a comment to D87399: Revert "[InstCombine] erase instructions leading up to unreachable".

Abandoned this one after @nikic applied https://github.com/llvm/llvm-project/commit/4e413e16216d0c94ada2171f3c59e0a85f4fa4b6
Thank you for everyone's help.

Thu, Sep 10, 9:35 AM · Restricted Project
lebedev.ri retitled D87455: [clang-tidy] performance-unnecessary-copy-initialization: Restrict UnnecessaryCopyInitialization check to variables initialized from free functions without arguments from Restrict UnnecessaryCopyInitialization check to variables initialized from methods/functions without arguments to [clang-tidy] performance-unnecessary-copy-initialization: Restrict UnnecessaryCopyInitialization check to variables initialized from methods/functions without arguments .
Thu, Sep 10, 8:00 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D71199: [clang-tidy] New check cppcoreguidelines-prefer-member-initializer.

So i've just reverted this in rGebf496d805521b53022a351f35854de977fee844.

@aaron.ballman @baloghadamsoftware how's QC going on nowadays here?
Was this evaluated on anything other than it's tests?

Surely. After I commit a patch, lots of buildbots verify it. They passed so far.

@baloghadamsoftware, i think you understand that wasn't the question.

Thu, Sep 10, 7:30 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D87451: add new clang option -mno-xcoff-visibility.

Bad commit message - links will go dead eventually, all the relevant stuff needs to be specified in the commit message itself.

Thu, Sep 10, 7:20 AM · Restricted Project, Restricted Project
lebedev.ri requested changes to D71199: [clang-tidy] New check cppcoreguidelines-prefer-member-initializer.
Thu, Sep 10, 6:35 AM · Restricted Project, Restricted Project
lebedev.ri reopened D71199: [clang-tidy] New check cppcoreguidelines-prefer-member-initializer.

So i've just reverted this in rGebf496d805521b53022a351f35854de977fee844.

Thu, Sep 10, 6:34 AM · Restricted Project, Restricted Project
lebedev.ri added a reverting change for rGf5fd7486d6c0: [clang-tidy] New check readability-prefer-member-initializer: rGebf496d80552: Revert "[clang-tidy] New check readability-prefer-member-initializer".
Thu, Sep 10, 6:33 AM
lebedev.ri committed rGebf496d80552: Revert "[clang-tidy] New check readability-prefer-member-initializer" (authored by lebedev.ri).
Revert "[clang-tidy] New check readability-prefer-member-initializer"
Thu, Sep 10, 6:33 AM
lebedev.ri added a reverting change for D71199: [clang-tidy] New check cppcoreguidelines-prefer-member-initializer: rGebf496d80552: Revert "[clang-tidy] New check readability-prefer-member-initializer".
Thu, Sep 10, 6:32 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D87435: [PGO] Skip if an IndirectBrInst critical edge cannot be split.

I believe every other caller ensures that as a precondition.
Why is the existing approach incorrect, and PGOInstrumentation shouldn't be fixed itself?

How do other callers handle such a critical edge?

By doing this check before calling the function?

SplitCriticalEdge is a generic API where some cases are already skipped (return nullptr), see isEHPad and other return nullptr in the function.

Thu, Sep 10, 1:03 AM · Restricted Project
lebedev.ri added inline comments to D76434: [SCEV] Query expanded immediate cost at minsize.
Thu, Sep 10, 1:01 AM · Restricted Project
lebedev.ri requested changes to D86346: [SimplifyCFG] Accumulate cost against budget.

Please do address my review comments.

Thu, Sep 10, 12:46 AM · Restricted Project
lebedev.ri requested changes to D86347: [SimplifyCFG] Two entry phi select costs.

Hold on, how does this interact with D86346?

Thu, Sep 10, 12:43 AM · Restricted Project
lebedev.ri added a comment to D86347: [SimplifyCFG] Two entry phi select costs.

So we have two thresholds at play here?
That doesn't seem right to me..

Thu, Sep 10, 12:40 AM · Restricted Project
lebedev.ri added a comment to D76434: [SCEV] Query expanded immediate cost at minsize.

@samparker why did you commit this?
I have not finished reviewing this, and i've requested changes to the previous revisions.

Thu, Sep 10, 12:30 AM · Restricted Project
lebedev.ri added a comment to D87435: [PGO] Skip if an IndirectBrInst critical edge cannot be split.

I believe every other caller ensures that as a precondition.
Why is the existing approach incorrect, and PGOInstrumentation shouldn't be fixed itself?

How do other callers handle such a critical edge?

By doing this check before calling the function?

Thu, Sep 10, 12:23 AM · Restricted Project
lebedev.ri added a comment to D87435: [PGO] Skip if an IndirectBrInst critical edge cannot be split.

I believe every other caller ensures that as a precondition.
Why is the existing approach incorrect, and PGOInstrumentation shouldn't be fixed itself?

Thu, Sep 10, 12:03 AM · Restricted Project

Wed, Sep 9

lebedev.ri requested review of D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).
Wed, Sep 9, 11:52 PM · Restricted Project
lebedev.ri added a comment to D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).

The problem here is that this is O(n^2) in the number of phi nodes, rather than O(n log n). So this will be faster for average inputs, but potentially much slower for degenerate cases. That said, I can't say that I've encountered "block with ten thousand phi nodes" as a problem before. It might still make sense to limit this heuristically, e.g. by limiting the inner loop to at most 100 iterations (which may fail to CSE some phi nodes in degenerate cases, but will avoid quadratic blowup.)

Wed, Sep 9, 11:52 PM · Restricted Project
lebedev.ri updated the summary of D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).
Wed, Sep 9, 11:48 PM · Restricted Project
lebedev.ri planned changes to D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).

Will update tomorrow once http://llvm-compile-time-tracker.com/ processes variants.

Wed, Sep 9, 2:43 PM · Restricted Project
lebedev.ri added a comment to D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).

The problem here is that this is O(n^2) in the number of phi nodes, rather than O(n log n). So this will be faster for average inputs, but potentially much slower for degenerate cases. That said, I can't say that I've encountered "block with ten thousand phi nodes" as a problem before.

Wed, Sep 9, 1:24 PM · Restricted Project
lebedev.ri updated the summary of D87408: [NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%).
Wed, Sep 9, 12:29 PM · Restricted Project