Today

lebedev.ri accepted D80489: [TargetLowering] Improve expandFunnelShift shift amount masking.

LG
http://volta.cs.utah.edu:8080/z/Nod0Gr

Sun, May 24, 3:12 AM · Restricted Project
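
As a reference point for the funnel-shift reviews above, here is a minimal Python sketch of `fshl` by its usual definition (concatenate the two operands, shift left by the amount modulo the bit width, keep the high half), next to one expansion that only uses in-range shift amounts. The function names and the particular masked expansion are illustrative assumptions, not necessarily what D80489 implements.

```python
def fshl_reference(x, y, z, bw):
    """fshl by definition: shift the 2*bw-bit value x:y left by z % bw, keep the top bw bits."""
    mask = (1 << bw) - 1
    wide = ((x & mask) << bw) | (y & mask)
    return ((wide << (z % bw)) >> bw) & mask


def fshl_masked(x, y, z, bw):
    """Expansion where every shift amount lies in [0, bw-1]; assumes bw is a power of two."""
    mask = (1 << bw) - 1
    shl_amt = z & (bw - 1)     # same as z % bw for power-of-two bw
    shr_amt = ~z & (bw - 1)    # same as bw - 1 - (z % bw)
    hi = (x << shl_amt) & mask
    # y >> (bw - z % bw), split into two shifts so we never shift by bw itself
    lo = ((y & mask) >> 1) >> shr_amt
    return hi | lo
```

Splitting the right shift into `>> 1` followed by `>> shr_amt` is what keeps the zero-shift case well defined: when `z % bw == 0` the low part becomes 0 and the result is just `x`.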
lebedev.ri updated the summary of D80489: [TargetLowering] Improve expandFunnelShift shift amount masking.
Sun, May 24, 3:12 AM · Restricted Project

Yesterday

lebedev.ri accepted D80466: [X86] Improve i8 + 'slow' i16 funnel shift codegen.
Sat, May 23, 12:11 PM · Restricted Project
lebedev.ri added a comment to D80466: [X86] Improve i8 + 'slow' i16 funnel shift codegen.

For fshl case, we could introduce some more ILP: http://volta.cs.utah.edu:8080/z/UJ6viM
https://godbolt.org/z/xsJgPb https://godbolt.org/z/5W26NV
Not sure it would be an improvement?
As a sidenote, we clearly don't fold to either variant in DAGCombiner.

Sat, May 23, 8:27 AM · Restricted Project
lebedev.ri added inline comments to D80466: [X86] Improve i8 + 'slow' i16 funnel shift codegen.
Sat, May 23, 4:12 AM · Restricted Project
lebedev.ri added a comment to D80344: [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1.

It may be helpful (even for the reviewers) to first specify their behavior,
instead of writing that after-the-fact "backwardly" based on the implementation.

For reviewers, the purpose of those intrinsics is described in the Summary section:

Like the discussion we just had in the HWLoops patch, unless the behavior of new intrinsics/etc is stated in langref, they are unspecified.

Sat, May 23, 3:40 AM · Restricted Project, Restricted Project

Fri, May 22

lebedev.ri resigned from D80344: [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1.

This should likely be at least 3 patches: llvm middle-end, llvm codegen, clang.
Langref changes missing for new intrinsics.
Please post all patches with full context (-U99999)

I was thinking of updating docs/ExceptionHandling.rst after this patch is accepted. Do you think these types of intrinsics should be described in Langref?
thanks,

Fri, May 22, 4:38 PM · Restricted Project, Restricted Project
lebedev.ri added a comment to D80344: [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1.

This should likely be at least 3 patches: llvm middle-end, llvm codegen, clang.
Langref changes missing for new intrinsics.
Please post all patches with full context (-U99999)

Fri, May 22, 8:34 AM · Restricted Project, Restricted Project
lebedev.ri committed rGcd921accf91a: [NFC] InstCombineNegator: use auto where type is obvious from the cast (authored by lebedev.ri).
[NFC] InstCombineNegator: use auto where type is obvious from the cast
Fri, May 22, 1:50 AM

Thu, May 21

lebedev.ri committed rGb2df96123198: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835) (authored by lebedev.ri).
[IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835)
Thu, May 21, 3:13 AM
lebedev.ri closed D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).
Thu, May 21, 3:13 AM · Restricted Project
lebedev.ri added a comment to D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).

I don't have an actionable example at hand. I could just drop FIXME if it's too weird.

Then please add a debug printout that it happened.

I'm sorry, i'm not sure i follow. It's not a bug, it's a potential improvement.
There is no point in trying to detect such a situation right now
because it will bring no benefit but will cost us compile time.

Thu, May 21, 2:08 AM · Restricted Project

Wed, May 20

lebedev.ri committed rG55430f53f397: [InstCombine] `insertelement` is negatible if both sources are negatible (authored by lebedev.ri).
[InstCombine] `insertelement` is negatible if both sources are negatible
Wed, May 20, 12:04 PM
lebedev.ri committed rGa6097cebe9cd: [NFC][InstCombine] Negator: tests for insertelement negation (authored by lebedev.ri).
[NFC][InstCombine] Negator: tests for insertelement negation
Wed, May 20, 12:04 PM
lebedev.ri committed rGebed96fdbf26: [InstCombine] Negator: `extractelement` is negatible if src is negatible (authored by lebedev.ri).
[InstCombine] Negator: `extractelement` is negatible if src is negatible
Wed, May 20, 12:04 PM
lebedev.ri committed rG952e7106b340: [NFC][InstCombine] Negator: tests for extractelement negation (authored by lebedev.ri).
[NFC][InstCombine] Negator: tests for extractelement negation
Wed, May 20, 12:04 PM
lebedev.ri added inline comments to D80276: [Alignment] Fix misaligned interleaved loads.
Wed, May 20, 10:22 AM · Restricted Project
lebedev.ri updated the diff for D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).

Attempted to address naming nits. Is this any better?

Wed, May 20, 7:34 AM · Restricted Project
lebedev.ri added a comment to D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).

Thank you for taking a look!

Wed, May 20, 1:35 AM · Restricted Project

Tue, May 19

lebedev.ri added a comment to D79100: [LV][TTI] Emit new IR intrinsic llvm.get.active.mask for tail-folded loops.

Ok, perhaps I got that wrong then. @samparker can correct me here perhaps, but as I said, for the hardware loop intrinsics I believe this was intentional. But anyway, as I said, will document this new one. As the hardware loop intrinsics are completely separate from this, I will do that separately.

Tue, May 19, 11:28 AM · Restricted Project
lebedev.ri added a comment to D79100: [LV][TTI] Emit new IR intrinsic llvm.get.active.mask for tail-folded loops.

Semantics are still unspecified. Before adding even more intrinsics,
i'd strongly suggest to specify at least the already-committed ones.
Because as far as i can tell, i don't see anything in langref for any of them.

This was intentional. By the already-committed ones you mean the hardware loop ones, and they are not meant to be user-facing intrinsics. That is, we don't expect users to play around with e.g. the hwloop.decrement intrinsic; at least, these are really meant to be generated by the optimisers.
This new intrinsic is slightly different, in that it is probably useful as a user-facing intrinsic, so I don't mind documenting it.

Tue, May 19, 10:22 AM · Restricted Project
lebedev.ri added a comment to D79369: [InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0".

Please feel free to add weekly-ish "ping" comments, i lost track of this, sorry :/
This looks good for the time being modulo nits.

Tue, May 19, 10:21 AM · Restricted Project
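
The fold named in the title of D79369 can be sanity-checked by brute force. The sketch below assumes unsigned arithmetic and a power-of-two `C`; that precondition is an assumption about the patch, not something stated in this feed, and the helper names are hypothetical.

```python
def divmul_is_zero(x, c):
    """X - (X / C) * C == 0, using unsigned (floor) division."""
    return x - (x // c) * c == 0


def mask_is_zero(x, c):
    """(X & (C - 1)) == 0."""
    return (x & (c - 1)) == 0


def forms_agree(c, bits=8):
    """True if both forms give the same answer for every unsigned `bits`-wide X."""
    return all(divmul_is_zero(x, c) == mask_is_zero(x, c) for x in range(1 << bits))
```

Calling `forms_agree` with a non-power-of-two such as 3 returns `False` (for example at `X = 4`), which is why the power-of-two assumption matters.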
lebedev.ri requested changes to D79100: [LV][TTI] Emit new IR intrinsic llvm.get.active.mask for tail-folded loops.

Semantics are still unspecified. Before adding even more intrinsics,
i'd strongly suggest to specify at least the already-committed ones.
Because as far as i can tell, i don't see anything in langref for any of them.

Tue, May 19, 8:07 AM · Restricted Project
lebedev.ri requested review of D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).
Tue, May 19, 5:54 AM · Restricted Project
lebedev.ri added a comment to D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).

ping

Tue, May 19, 4:49 AM · Restricted Project

Sun, May 17

lebedev.ri committed rGfde8eb00e146: [InstCombine] visitMaskedMerge(): when unfolding, sanitize undef constants… (authored by lebedev.ri).
[InstCombine] visitMaskedMerge(): when unfolding, sanitize undef constants…
Sun, May 17, 1:19 PM

Sat, May 16

lebedev.ri added a comment to D80062: [x86] Propagate memory operands during call frame optimization.

Can there be test coverage for this?

Sat, May 16, 1:45 PM · Restricted Project
lebedev.ri added a comment to D79078: [VectorCombine] Leave reduction operation to SLP.

I also sympathize with trying to solve this here rather than SLP. One of the reasons vector-combine exists is because SLP became too hard to reason about. In hindsight, we should have created a separate pass for reductions - those are not traditional SLP concerns. Just my opinion. :)

I'm not sure what you have in mind here?
That *this* pass should also form such reductions?
Or that we should not disturb them after SLP formed them?
Or something else?

The reduction logic is a complicated blob of code, so I don't think it belongs here. I'd split it off from SLP into its own pass, but it looks like a lot of untangling.
Currently, we're running this pass *before* SLP only. We could move this after SLP to make sure we are not disturbing reductions before SLP has a chance to recognize them...but I'm not sure if that would also now cause regressions. I don't have a good feel for how these passes are interacting.

What does it take to cause the infinite looping that you found?

No, i mean the case where we would be forming reductions here,
because i think we'd then have two conflicting transforms here,
and they would cause a traditional instcombine/dagcombine-esque infinite combine loop.

Sat, May 16, 9:29 AM · Restricted Project

Fri, May 15

lebedev.ri added a comment to D79078: [VectorCombine] Leave reduction operation to SLP.

I'll defer to @spatel, although i semi-weakly insist that adding such cut-offs is weird.
OTOH if this pass were taught to form such reductions, we would have caught this regression for free,
because after D79799 we would have ended up in an endless combine loop here.

Fri, May 15, 10:51 AM · Restricted Project
lebedev.ri added a comment to D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).

Thanks a lot for taking this over! Looks good from my side but you may wish to wait for a more expert review.

I also confirmed that this still fixes our issue on the 10.0 branch. I had to remove -scev-cheap-expansion-budget=1 since that didn't exist in 10, but fortunately it's not needed there. Since it wasn't a clean merge, I put the diff in P8220 if it makes your/Tom's life easier.

Fri, May 15, 10:51 AM · Restricted Project

Wed, May 13

lebedev.ri added a comment to D79830: Add support of __builtin_expect_with_probability.

Thanks for working on this.
Please upload patch with full context.

Hi, sorry, but I'm not sure what full context means. Does it mean I need to show more lines (for example, 10 lines) when generating the patch, or show the whole function? Thank you!

Wed, May 13, 12:31 PM · Restricted Project, Restricted Project
lebedev.ri added a reviewer for D79830: Add support of __builtin_expect_with_probability: erichkeane.

Thanks for working on this.
Please upload patch with full context.

Wed, May 13, 1:34 AM · Restricted Project, Restricted Project

Tue, May 12

lebedev.ri updated the diff for D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).

Is it a problem that the test case doesn't fail on trunk before the patch, since SCEVMinMaxExpr is no longer automatically high cost? I noted this in D79720 but wasn't sure how to nudge the code around it.

Tue, May 12, 8:34 AM · Restricted Project
lebedev.ri created D79787: [IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835).
Tue, May 12, 7:29 AM · Restricted Project
lebedev.ri requested changes to D79720: [IndVarSimplify][LoopUtils] Track rewrite cost per unique BB (PR45835).

D79787

Tue, May 12, 7:29 AM · Restricted Project

Mon, May 11

lebedev.ri added a comment to D79720: [IndVarSimplify][LoopUtils] Track rewrite cost per unique BB (PR45835).

As discussed in IRC, there is a much more fundamental problem that needs to be solved.
I'll take a look.

Mon, May 11, 12:56 PM · Restricted Project
lebedev.ri added a comment to D79720: [IndVarSimplify][LoopUtils] Track rewrite cost per unique BB (PR45835).

Yes, but it is still not obvious to me as to why that happens?
It's the same PHI node, we are asking about the same value, for the same basic block.
Why do we not find expansion first time but do find it second time?
Did we perform some expansion in between?

If I'm understanding my recording correctly -- the reason the expansion was found the second time was that we expanded the value the first time, immediately after computing isHighCostExpansion: https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp#L675

Mon, May 11, 12:22 PM · Restricted Project
lebedev.ri added a comment to D79720: [IndVarSimplify][LoopUtils] Track rewrite cost per unique BB (PR45835).

Thank you for looking into it!

Mon, May 11, 9:39 AM · Restricted Project
lebedev.ri retitled D79720: [IndVarSimplify][LoopUtils] Track rewrite cost per unique BB (PR45835) from [IndVarSimplify][LoopUtils] Track rewrite cost per unique BB to [IndVarSimplify][LoopUtils] Track rewrite cost per unique BB (PR45835).
Mon, May 11, 9:39 AM · Restricted Project
lebedev.ri added a reviewer for D79720: [IndVarSimplify][LoopUtils] Track rewrite cost per unique BB (PR45835): mkazantsev.
Mon, May 11, 9:39 AM · Restricted Project

Fri, May 8

lebedev.ri added inline comments to D78659: Add nomerge function attribute to supress tail merge optimization in simplifyCFG.
Fri, May 8, 12:20 PM · Restricted Project, Restricted Project

Wed, May 6

lebedev.ri accepted D79452: [VectorCombine] scalarize binop of inserted elements into vector constants.

Seems fairly uncontroversial to me.

Wed, May 6, 12:30 AM · Restricted Project
lebedev.ri added a comment to D78729: [Attributor] Merge the query set into AbstractAttribute.

Seems reasonable to me but i'm not really familiar with attributor internals..

Wed, May 6, 12:30 AM · Restricted Project
lebedev.ri requested changes to D79078: [VectorCombine] Leave reduction operation to SLP.
Wed, May 6, 12:30 AM · Restricted Project
lebedev.ri accepted D79171: [InstCombine] canonicalize bitcast after insertelement into undef.

LG in general, but MMX stuff puzzles me, so would be good for @craig.topper to comment.

Wed, May 6, 12:30 AM · Restricted Project

Tue, May 5

lebedev.ri added a comment to D79304: [DAG] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476).

Now that we handle it in instcombine, do we still want dagcombine fold?

Tue, May 5, 11:57 PM · Restricted Project
lebedev.ri requested changes to D72423: [DemandedBits] Improve accuracy of Add propagator.

as per my last comment

Tue, May 5, 11:57 PM · Restricted Project
lebedev.ri added a comment to D79234: [ValueTracking] Fix computeKnownBits() with bitwidth-changing ptrtoint.

Is this fixing some bug? Is there an observable behavior change?
Can a (currently-crashing) test be added for some pass?
Or can this only be triggered via a specially-crafted unit test?

Tue, May 5, 11:57 PM · Restricted Project
lebedev.ri added inline comments to D79369: [InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0".
Tue, May 5, 2:03 PM · Restricted Project
lebedev.ri accepted D79407: [InstCombine] Remove hasOneUse check for pow(C,x) -> exp2(log2(C)*x).

Seems reasonable.

Tue, May 5, 5:20 AM · Restricted Project
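
The rewrite in the title of D79407 rests on the identity pow(C, x) = exp2(log2(C) * x) for positive C. A minimal numeric sketch, assuming C > 0 and using hypothetical helper names, with `2.0 ** y` standing in for exp2:

```python
import math


def pow_direct(c, x):
    """pow(C, x) computed directly."""
    return math.pow(c, x)


def pow_via_exp2(c, x):
    """exp2(log2(C) * x); only meaningful for C > 0."""
    return 2.0 ** (math.log2(c) * x)


def close_enough(c, x, rel_tol=1e-9):
    """Compare the two forms up to floating-point rounding."""
    return math.isclose(pow_direct(c, x), pow_via_exp2(c, x), rel_tol=rel_tol)
```

The two forms are compared with a tolerance because they are not guaranteed to be bit-identical in floating point.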

Mon, May 4

lebedev.ri added inline comments to D79369: [InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0".
Mon, May 4, 11:07 PM · Restricted Project
lebedev.ri added inline comments to D79369: [InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0".
Mon, May 4, 11:07 PM · Restricted Project
lebedev.ri added inline comments to D79003: [DAG] Add SimplifyDemandedVectorElts binop SimplifyMultipleUseDemandedBits handling.
Mon, May 4, 7:59 AM · Restricted Project
lebedev.ri added a comment to D79003: [DAG] Add SimplifyDemandedVectorElts binop SimplifyMultipleUseDemandedBits handling.

Seems reasonable to me in general.

Mon, May 4, 4:46 AM · Restricted Project
lebedev.ri added inline comments to D79321: [SLC] Allow llvm.pow(2**n,x) -> llvm.exp2(n*x) even if no exp2 lib func.
Mon, May 4, 4:46 AM · Restricted Project
lebedev.ri accepted D79319: [InstCombine] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476).

LG, thank you

Mon, May 4, 4:46 AM · Restricted Project
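
The fold accepted here, mul(abs(x), abs(x)) -> mul(x, x), can be checked exhaustively for a small bit width. This sketch models plain two's-complement wrapping on 8-bit values only; it does not model overflow flags or poison, and the helper names are hypothetical.

```python
def wrap(v, bits=8):
    """Interpret v modulo 2**bits as a signed two's-complement integer."""
    m = 1 << bits
    v &= m - 1
    return v - m if v >= (m >> 1) else v


def abs_wrapping(x, bits=8):
    """abs that wraps on overflow, so abs(INT_MIN) == INT_MIN."""
    return wrap(-x if x < 0 else x, bits)


def mul_wrapping(a, b, bits=8):
    """Multiplication truncated to `bits` bits."""
    return wrap(a * b, bits)


def fold_holds(x, bits=8):
    """mul(abs(x), abs(x)) == mul(x, x) under wrapping arithmetic."""
    a = abs_wrapping(x, bits)
    return mul_wrapping(a, a, bits) == mul_wrapping(x, x, bits)
```

The interesting input is the minimum signed value, where the wrapping abs returns its argument unchanged and the two products still agree.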
lebedev.ri accepted D68231: [SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func.

Patch description still needs an update but i think this looks okay.

Mon, May 4, 2:37 AM · Restricted Project

Sun, May 3

lebedev.ri added a comment to D79304: [DAG] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476).

Echoing what @craig.topper asked, do we somehow end up with that pattern after instcombine had a chance to do this fold?

Where did he ask that?

https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200427/776590.html

Sun, May 3, 12:45 PM · Restricted Project
lebedev.ri added inline comments to D79304: [DAG] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476).
Sun, May 3, 11:41 AM · Restricted Project
lebedev.ri added a comment to D79304: [DAG] Fold (mul(abs(x),abs(x))) -> (mul(x,x)) (PR39476).

Echoing what @craig.topper asked, do we somehow end up with that pattern after instcombine had a chance to do this fold?

Sun, May 3, 11:09 AM · Restricted Project

Sat, May 2

lebedev.ri resigned from D79298: [NFC] Traverse function using dominator tree..

I'm not really sure as to the future of this DSE given the work on Mem-SSA-driven DSE.

Sat, May 2, 11:18 PM · Restricted Project
lebedev.ri added inline comments to D68231: [SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func.
Sat, May 2, 1:33 AM · Restricted Project

Fri, May 1

lebedev.ri accepted D79041: [InstCombine] Fold or(zext(bswap(x)),shl(zext(bswap(y)),bw/2)) -> bswap(or(zext(x),shl(zext(y), bw/2)).

This looks good to me, but i'm not sure this will work more generally.
I.e., i will be surprised if the or tree reaches us in the proper form,
without reassociation into some other form we aren't expecting.
Especially for larger tree depths.

Fri, May 1, 11:57 PM · Restricted Project
lebedev.ri added a comment to D68231: [SLC] Allow llvm.pow(x,2.0) -> x*x etc even if no pow() lib func.

The patch description is only stating the previous status-quo as a fact,
but gives no reasoning as to why it should change, why it is legal/okay to change it.

Fri, May 1, 11:57 PM · Restricted Project
lebedev.ri added a comment to D74051: Move update_cc_test_checks.py tests to clang.

This broke running clang tests stand-alone:

Traceback (most recent call last):
  File "/var/tmp/portage/sys-devel/clang-11.0.0.9999/work/x/y/clang-abi_x86_64.amd64/bin/../../llvm/utils/lit/lit/TestingConfig.py", line 89, in load_from_path
    exec(compile(data, path, 'exec'), cfg_globals, None)
  File "/var/tmp/portage/sys-devel/clang-11.0.0.9999/work/x/y/clang/test/utils/update_cc_test_checks/lit.local.cfg", line 21, in <module>
    assert os.path.isfile(script_path)
AssertionError
                                                           
FAILED: test/CMakeFiles/check-clang

Obviously this fails when LLVM source tree is not available.

Is that a supported build mode for clang?

Fri, May 1, 11:57 PM · Restricted Project, Restricted Project
lebedev.ri added inline comments to D68414: [SROA] Enhance AggLoadStoreRewriter to rewrite integer load/store if it covers multi fields in original aggregate.
Fri, May 1, 12:35 AM · Restricted Project

Thu, Apr 30

lebedev.ri added a comment to D78861: [Attributor] Track AA dependency using dependency graph.

Thanks for working on this. I added a first set of comments below. We'll have to rebase this once the changes to reduce memory usage are all in. We will also need to verify this does not regress memory usage too much.

I'd like to note that unless i'm mistaken right now all this graph stuff is not actually being used for attributes, but only for printing the graph of attribute dependency.
Is there a plan to actually use the graph? If not, then the graph shouldn't be built unless there was a request to output it, i think.

Thu, Apr 30, 10:40 AM · Restricted Project
lebedev.ri added a comment to D79171: [InstCombine] canonicalize bitcast after insertelement into undef.

I messed with mmx in this area once and regretted it. See rG5ebbabc1af360756f402203ba7704bb480f279a7

Is canonicalizing towards a vector type that wasn't mentioned in the IR the right way to go? Is that cast free for legal types on all targets? I think it's potentially scalarized or becomes a load/store to stack temporary for illegal types in the backend.

Thu, Apr 30, 10:40 AM · Restricted Project

Wed, Apr 29

lebedev.ri added a comment to D79078: [VectorCombine] Leave reduction operation to SLP.

This needs a phase-ordering test.

For now vector-combine is executed before slp in both legacy pm and new pm with O2, so either we handle it here or slp can handle this kind of pattern.

I'm not really sure what you mean.
We clearly have phase-ordering issue, and we should have a test that shows it.

Wed, Apr 29, 5:51 AM · Restricted Project
lebedev.ri added a comment to D79078: [VectorCombine] Leave reduction operation to SLP.

This needs a phase-ordering test.
Why shouldn't SLPVectorizer be taught about that pattern instead?

Wed, Apr 29, 3:43 AM · Restricted Project

Tue, Apr 28

lebedev.ri added a comment to D76886: [InlineFunction] Disable emission of alignment assumptions by default.

Don't we have a less cumbersome way to test for alignment?

That is the current canonical alignment assumption, at least until the attribute bundles are here.

Tue, Apr 28, 11:17 AM · Restricted Project
lebedev.ri committed rGa0004358a8e7: [InstCombine] Negator: 'or' with no common bits set is just 'add' (authored by lebedev.ri).
[InstCombine] Negator: 'or' with no common bits set is just 'add'
Tue, Apr 28, 9:39 AM
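
The observation in this commit title, that an 'or' of operands with no common bits set behaves like an 'add', is easy to confirm by brute force. A minimal sketch on 8-bit values with hypothetical helper names; the last helper shows the consequence for negation, -(a | b) == (-a) - b modulo 2**bits, which is an illustration rather than a description of the Negator code.

```python
def or_equals_add(a, b, bits=8):
    """True if (a | b) and (a + b) agree modulo 2**bits."""
    mask = (1 << bits) - 1
    return (a | b) == ((a + b) & mask)


def neg(v, bits=8):
    """Two's-complement negation modulo 2**bits."""
    return -v & ((1 << bits) - 1)


def negated_or_matches(a, b, bits=8):
    """-(a | b) == (-a) - b, valid when a and b share no set bits."""
    mask = (1 << bits) - 1
    return neg(a | b, bits) == ((neg(a, bits) - b) & mask)
```

With overlapping bits the identity breaks: `or_equals_add(1, 1)` is `False`, since `1 | 1` is 1 but `1 + 1` is 2.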
lebedev.ri committed rGa5f22f2b0ef2: [NFC][InstCombine] Tests for negation of 'or' with no common bits set (authored by lebedev.ri).
[NFC][InstCombine] Tests for negation of 'or' with no common bits set
Tue, Apr 28, 9:39 AM
lebedev.ri added a comment to D79002: [NFC][CostModel] Add TargetCostKind to relevant APIs.

Weak -0.01c
I think this might deserve an RFC on llvm-dev.

Tue, Apr 28, 7:28 AM · Restricted Project

Apr 24 2020

lebedev.ri added a comment to D78582: [InstCombine] substitute equivalent constant to reduce logic-of-icmps.

It looks like this caused a +0.40% text size increase on kimwitu++. There's a corresponding > 10% compile-time increase on the main.cc file, so that's likely the relevant one.

If you have time, it would be good to double check what's going on there, as text size increase usually means we're losing optimizations.

I'm going to expose my lack of git knowledge here, but how do we know this commit is to blame for the difference?

Better link: http://llvm-compile-time-tracker.com/compare.php?from=7003a1da37b2aae5b17460922efde9efde2c229d&to=62da6ecea298739ad59c0563ce6d9493804ef1f0&stat=size-text This one only contains this change in the diff range. (The previous one had an additional MLIR change, which can't make a difference.)

How do I determine the build flags that are needed to repro?

The flags are determined by test-suite cmake/caches/O3.cmake configuration. I get the flags by running ninja kc -v and sticking -emit-llvm on the right one, which would be for me:

/home/nikic/llvm-project/build/bin/clang++ -DNDEBUG -O3 -w -Werror=date-time -I/home/nikic/llvm-test-suite/MultiSource/Applications/kimwitu++ -DYYDEBUG=1 -MD -MT MultiSource/Applications/kimwitu++/CMakeFiles/kc.dir/util.cc.o -MF MultiSource/Applications/kimwitu++/CMakeFiles/kc.dir/main.cc.o.d -o MultiSource/Applications/kimwitu++/CMakeFiles/kc.dir/main.cc.o -c ../MultiSource/Applications/kimwitu++/main.cc -emit-llvm

I tried reverting this patch locally and compiling /testsuite/trunk/MultiSource/Applications/kimwitu++/main.cc, and the IR is identical to pre-revert compiled at -O2 or -O3.

Hm, are you sure you reverted the right one? I also didn't see a difference, until I realized I reverted the InstSimplify change instead of the InstCombine change, duh. After fixing that, I get these two files and diff: https://gist.github.com/nikic/ecbe2482367be25b17ab982c2346b661

I'm looking into this, and i believe those are improvements, not regressions.

Thanks! I double-checked my setup, and I'm still not able to repro. I wonder if platform makes a difference (I'm testing on macOS)?

Absolutely, more specifically libc++ (you) vs libstdc++ (me) makes difference.

Apr 24 2020, 11:53 AM · Restricted Project
lebedev.ri added a comment to D78582: [InstCombine] substitute equivalent constant to reduce logic-of-icmps.

It looks like this caused a +0.40% text size increase on kimwitu++. There's a corresponding > 10% compile-time increase on the main.cc file, so that's likely the relevant one.

If you have time, it would be good to double check what's going on there, as text size increase usually means we're losing optimizations.

I'm going to expose my lack of git knowledge here, but how do we know this commit is to blame for the difference?

Better link: http://llvm-compile-time-tracker.com/compare.php?from=7003a1da37b2aae5b17460922efde9efde2c229d&to=62da6ecea298739ad59c0563ce6d9493804ef1f0&stat=size-text This one only contains this change in the diff range. (The previous one had an additional MLIR change, which can't make a difference.)

How do I determine the build flags that are needed to repro?

The flags are determined by test-suite cmake/caches/O3.cmake configuration. I get the flags by running ninja kc -v and sticking -emit-llvm on the right one, which would be for me:

/home/nikic/llvm-project/build/bin/clang++ -DNDEBUG -O3 -w -Werror=date-time -I/home/nikic/llvm-test-suite/MultiSource/Applications/kimwitu++ -DYYDEBUG=1 -MD -MT MultiSource/Applications/kimwitu++/CMakeFiles/kc.dir/util.cc.o -MF MultiSource/Applications/kimwitu++/CMakeFiles/kc.dir/main.cc.o.d -o MultiSource/Applications/kimwitu++/CMakeFiles/kc.dir/main.cc.o -c ../MultiSource/Applications/kimwitu++/main.cc -emit-llvm

I tried reverting this patch locally and compiling /testsuite/trunk/MultiSource/Applications/kimwitu++/main.cc, and the IR is identical to pre-revert compiled at -O2 or -O3.

Hm, are you sure you reverted the right one? I also didn't see a difference, until I realized I reverted the InstSimplify change instead of the InstCombine change, duh. After fixing that, I get these two files and diff: https://gist.github.com/nikic/ecbe2482367be25b17ab982c2346b661

Apr 24 2020, 10:48 AM · Restricted Project
lebedev.ri added a comment to D78810: [InstCombine] Check max alignment before adding attr on aligned_alloc.

Thank you for looking into this.
This should instead clamp to the maximal alignment. (the 'is power of 2' check should remain)

Clamping would be incorrect here since the allocation risks not providing the desired alignment
if it gets promoted to alloca.

Ah, interesting point, i have not considered it.
But then we already have that problem in other places,
at least clang/lib/CodeGen/CGCall.cpp, AbstractAssumeAlignedAttrEmitter.

Apr 24 2020, 10:48 AM · Restricted Project
lebedev.ri requested changes to D78810: [InstCombine] Check max alignment before adding attr on aligned_alloc.

Thank you for looking into this.
This should instead clamp to the maximal alignment. (the 'is power of 2' check should remain)

Apr 24 2020, 8:36 AM · Restricted Project

Apr 23 2020

lebedev.ri added a comment to D68408: [InstCombine] Negator - sink sinkable negations.
In D68408#1998112, @kcc wrote:

Hi,

Hi.

Apr 23 2020, 2:40 PM · Restricted Project
lebedev.ri committed rG5a159ed2a8e5: [InstCombine] Negator: don't negate multi-use `sub` (authored by lebedev.ri).
[InstCombine] Negator: don't negate multi-use `sub`
Apr 23 2020, 2:09 PM
lebedev.ri added inline comments to D78734: [CaptureTracking] Make MaxUsesToExplore cheaper (NFC).
Apr 23 2020, 1:33 PM · Restricted Project
lebedev.ri accepted D78722: [Attributor][NFC] Encode IRPositions in the bits of a single pointer.

Seems reasonable to me, thanks for looking into it.

Apr 23 2020, 10:15 AM · Restricted Project
lebedev.ri accepted D78582: [InstCombine] substitute equivalent constant to reduce logic-of-icmps.

Same preexisting soundness problem as visible in D78430, but other than that LG.

Apr 23 2020, 2:41 AM · Restricted Project
lebedev.ri accepted D78430: [InstSimplify] fold and/or of compares with equality to min/max constant.

This looks good to me, but we have a soundness problem with existing nullptr folds, specifically
(X == null) || (X u<= Y) --> X u<= Y
(X != null) && (X u> Y) --> X u> Y

Apr 23 2020, 2:06 AM · Restricted Project
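
For concrete values, the two nullptr folds quoted above are plain boolean identities over unsigned integers. The sketch below checks that with null modeled as 0; the helper names are hypothetical. It says nothing about the soundness problem the comment raises, since it does not model any IR-level subtlety beyond ordinary values.

```python
def or_fold_holds(x, y):
    """(X == null) || (X u<= Y)  agrees with  X u<= Y, with null modeled as 0."""
    return ((x == 0) or (x <= y)) == (x <= y)


def and_fold_holds(x, y):
    """(X != null) && (X u> Y)  agrees with  X u> Y, with null modeled as 0."""
    return ((x != 0) and (x > y)) == (x > y)


def folds_hold_for_all(limit=16):
    """Exhaustive check over unsigned values in [0, limit)."""
    return all(or_fold_holds(x, y) and and_fold_holds(x, y)
               for x in range(limit) for y in range(limit))
```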
lebedev.ri requested changes to D78582: [InstCombine] substitute equivalent constant to reduce logic-of-icmps.

Patch doesn't apply for me.
Does this go on top of git master, or D78430?

Apr 23 2020, 2:06 AM · Restricted Project

Apr 22 2020

lebedev.ri added a comment to D78659: Add nomerge function attribute to supress tail merge optimization in simplifyCFG.
  1. Tests missing
  2. Why isn't noinline sufficient, why a whole new attribute?
Apr 22 2020, 12:30 PM · Restricted Project, Restricted Project
lebedev.ri updated subscribers of D78659: Add nomerge function attribute to supress tail merge optimization in simplifyCFG.

+ llvm-commits

Apr 22 2020, 12:30 PM · Restricted Project, Restricted Project
lebedev.ri committed rGa70d2ab323a2: [NFC][InstCombine] Tests for negation of sign-/zero- extensions (authored by lebedev.ri).
[NFC][InstCombine] Tests for negation of sign-/zero- extensions
Apr 22 2020, 8:06 AM
lebedev.ri committed rG67266d879c71: [InstCombine] Negator: shufflevector is negatible (authored by lebedev.ri).
[InstCombine] Negator: shufflevector is negatible
Apr 22 2020, 5:23 AM
lebedev.ri committed rG4d44ce743781: [NFC][InstCombine] Add shuffle negation tests (authored by lebedev.ri).
[NFC][InstCombine] Add shuffle negation tests
Apr 22 2020, 5:23 AM
lebedev.ri retitled D78619: [mlir] Extended Alloc and Dealloc operations with memory-effect traits. from Extended Alloc and Dealloc operations with memory-effect traits. to [mlir] Extended Alloc and Dealloc operations with memory-effect traits..
Apr 22 2020, 5:22 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D78619: [mlir] Extended Alloc and Dealloc operations with memory-effect traits..

Updated commit message and added a short summary.

Apr 22 2020, 5:22 AM · Restricted Project, Restricted Project
lebedev.ri added a comment to D78430: [InstSimplify] fold and/or of compares with equality to min/max constant.

If there's agreement on that direction, I will abandon this patch.

but note that it would be a suspended, half-baked state,
since then we'd neither generalize the existing folds,
nor drop them.

Yes, it's fragmented. We could still remove the existing chunks shown here if we think that instcombine is probably good enough.
We have to decide if the reduction in code is worthy.

Apr 22 2020, 1:03 AM · Restricted Project

Apr 21 2020

lebedev.ri added a comment to D78430: [InstSimplify] fold and/or of compares with equality to min/max constant.

We can do more optimization with less code in instcombine, so here's a step towards that:
D78582

I agree that we should have powerful instcombine fold,

Apr 21 2020, 2:05 PM · Restricted Project
lebedev.ri added a comment to D78582: [InstCombine] substitute equivalent constant to reduce logic-of-icmps.

Does this supersede any of the existing folds?

Apr 21 2020, 2:05 PM · Restricted Project
lebedev.ri committed rG352fef3f11f5: [InstCombine] Negator - sink sinkable negations (authored by lebedev.ri).
[InstCombine] Negator - sink sinkable negations
Apr 21 2020, 12:27 PM
lebedev.ri closed D68408: [InstCombine] Negator - sink sinkable negations.
Apr 21 2020, 12:27 PM · Restricted Project
lebedev.ri added inline comments to D78538: [llvm][NFC][CallSite] Remove CallSite from DeadArgumentElimination.
Apr 21 2020, 11:52 AM · Restricted Project
lebedev.ri added inline comments to D68408: [InstCombine] Negator - sink sinkable negations.
Apr 21 2020, 11:20 AM · Restricted Project
lebedev.ri updated the diff for D68408: [InstCombine] Negator - sink sinkable negations.

Updated: adjust comments, lambda name, guard sdiv with an artificial one-use-check.

Apr 21 2020, 10:49 AM · Restricted Project