- User Since
- Nov 23 2015, 1:58 PM (159 w, 6 d)
Mon, Dec 10
LGTM. Removing the insert_subvector nesting transform is especially nice. I think it's fine to go in as is even before the zero forwarding, but I'll defer to others.
Tue, Dec 4
Thu, Nov 29
oddshuffles.ll is definitely unrelated. It seems to have disappeared post rebase. Not sure why it appeared, as neither this patch nor its predecessor touched it.
Rebase to tip.
Tue, Nov 27
LGTM modulo a small nit I missed on the last pass.
Mon, Nov 26
Sat, Nov 24
Mon, Nov 19
Nov 15 2018
We've superseded this patch.
Nov 14 2018
Resolve Quentin's comments. Revert some tests which don't need changes after previous bugfix.
Nov 12 2018
Nov 9 2018
Fold in Peter's comments.
Modify move immediate check to check all register definitions are physical. Also fix typo (s/isReg/getReg) that caused all register immediate assignments to be reordered.
This appears to have resolved the instance of PR26810 in CodeGen/X86/hoist-spill.ll, but otherwise doesn't seem to have a notable effect on code generation.
This is no longer necessary now that r346432 has landed.
Nov 8 2018
Nov 7 2018
Address comments (typos and comment fixes) and rebase.
Oct 30 2018
Modulo removing the unnecessary condition you commented on, this looks good to me. Thanks.
Simplify. Update chain after both parallelizing chain and individual chain improvements. Also use IntervalMap.
Oct 25 2018
Oct 24 2018
Oct 23 2018
It's certainly less likely than before, though it does still happen in the wild. D53552 should always parallelize these store chains, so this should be droppable.
Oct 22 2018
which supports my supposition, but "hasOperation" seems to match your interpretation (maybe that's the right check here).
I don't think we need to avoid custom operations post-legalization. The documentation doesn't seem to address this, but as I understand it, Custom is Legal but with non-standard lowering, and we just happen to lower custom nodes in legalize as an easy way to make sure we've legalized the DAG.
LGTM modulo a minor typo.
Oct 17 2018
Rebase to head. Some changes to alias make this require a much longer chain of stores. The need is clearly less than when I wrote this patch a year ago, but still has a minor positive effect.
Oct 16 2018
Fix topological ordering construction to prevent premature pruning.
Oct 15 2018
Oh, good catch. This is O(N^3) for long chains because we try every prefix of the chain, and because the chain is so long our chain improvement gives up and no-ops, so we just redo the same computation over and over. Cutting the length to any reasonable size makes it O(N^2). I suspect your observed merge size requirement is there to deal with our incomplete handling of chain sub-DAGs in store merge (just the children of a single TF, potentially skipping some loads), because it shouldn't matter otherwise (see (1)).
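As a rough illustration of the complexity claim, here is a toy cost model in Python. It is not the store merging code: the function name `attempts_cost`, the `max_len` parameter, and the assumption that every attempt re-walks the whole chain are all made up for this sketch.

```python
def attempts_cost(n, max_len=None):
    """Count unit steps for a chain of n stores under a toy cost model.

    Assumptions (hypothetical, for illustration only):
      * every store in the chain triggers a merge attempt,
      * each attempt tries every prefix length up to the limit,
      * each try re-walks the whole chain (n steps), because the chain
        improvement gave up and nothing was simplified in between.
    """
    limit = n if max_len is None else min(n, max_len)
    steps = 0
    for _store in range(n):                  # one attempt per store
        for _length in range(1, limit + 1):  # every prefix length tried
            steps += n                       # redo the same chain walk
    return steps


if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, attempts_cost(n), attempts_cost(n, max_len=4))
```

Under these assumptions, doubling the chain length multiplies the uncapped count by eight and the capped count by four, which is the cubic versus quadratic behaviour described above.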
Oct 11 2018
Good point. Scenario 1 may happen if we have nodes A, B, C, GlueOp, and GlueUser such that there are uses (A->GlueUser, GlueOp->GlueUser, GlueOp->B, B->C) and the node id ordering is GlueOp < B < A < GlueUser < C. Checking whether A is a predecessor of C would then stop searching at B and fail to see the remaining nodes. I am unsure if this is possible currently given our id scheme, but I'll take a look. I believe it will always bias towards selecting A before B.
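One reading of this scenario can be sketched with a toy graph in Python. This is not LLVM code: the dictionaries, the `is_predecessor` helper, and the rule that a glued pair is treated as a single unit are assumptions made only for the illustration.

```python
# operands[n] lists the nodes that n uses directly (operand edges).
operands = {
    "GlueOp": [],
    "B": ["GlueOp"],
    "A": [],
    "GlueUser": ["A", "GlueOp"],
    "C": ["B"],
}

# Node ids in the order GlueOp < B < A < GlueUser < C.
node_id = {"GlueOp": 0, "B": 1, "A": 2, "GlueUser": 3, "C": 4}

# Assumed: GlueOp and GlueUser are glued, so anything that depends on
# GlueOp effectively also depends on GlueUser's operands.
glued_user = {"GlueOp": "GlueUser"}


def is_predecessor(target, start, prune, follow_glue):
    """Walk operand edges backwards from start, looking for target."""
    worklist = [start]
    visited = set()
    while worklist:
        n = worklist.pop()
        if n == target:
            return True
        if n in visited:
            continue
        visited.add(n)
        # Id-based pruning: a node with a smaller id than target is
        # assumed unable to reach target, so the search stops here.
        if prune and node_id[n] < node_id[target]:
            continue
        worklist.extend(operands[n])
        if follow_glue and n in glued_user:
            worklist.append(glued_user[n])
    return False


if __name__ == "__main__":
    # The pruned search stops at B (id 1 < id of A) and never reaches GlueOp.
    print(is_predecessor("A", "C", prune=True, follow_glue=True))
    # Without pruning, the glue link exposes the dependency on A.
    print(is_predecessor("A", "C", prune=False, follow_glue=True))
```

In this model the pruned search reports that A is not a predecessor of C, while the unpruned search that follows the glue link finds it.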
Oct 10 2018
Oct 9 2018
Rebase to tip.
Rebase to tip and address rksimon's renaming suggestion.
Oct 3 2018
Oct 2 2018
Simplify tests. Add comment.
Oct 1 2018
Sep 28 2018
Update to better match GCC's behavior. Not as aggressive as we can be.
Sep 26 2018
Sep 25 2018
Sep 24 2018
It looks like you may not have commit access. Would you like me to commit this for you?
Sep 20 2018
Sadly, no clever ideas on my part. If it's not repeatable on any of the in-tree backends, let's just commit this now; it fixes a clear oversight.
Do you have a test case?
Sep 19 2018
I've gone through and marked all the places.
Sep 18 2018
The codesize issues are minor and shouldn't hold this patch up. The only blocker I see is the unnecessary data shuffling for SSE41 codegen which someone else should decide on.
Looks fine modulo the noted issues with load.
Sep 17 2018
Yes, this is a fix to match GCC's assignment of a 64-bit value in
32-bit mode to pairs of registers.