- User Since
- Jul 7 2012, 2:54 PM (275 w, 4 d)
Add a test case covering critical edge splitting.
Add a stricter form of checking for profitability to avoid one potential issue
with this patch. Also clean up comments in various places to try and help
address code review feedback.
Hopefully responding to all of the comments. Note the last one which is probably the most interesting point of discussion around cost modeling.
Mon, Oct 16
Update addressing the remaining comments from Sanjoy.
Updates and/or responses. Largely addressed all the comments with the next patch upload.
Sun, Oct 15
Rebase onto the new LoopInfo API and get everything working. Also rebase into
a monorepo checkout.
Fri, Oct 13
This patch got reverted again, and I wanted to provide some insight as to why...
Thu, Oct 12
Wed, Oct 11
Tue, Oct 10
Have you considered building a ChunkedVector instead of a ChunkedList? Specifically, there is a great trick where you use a single index with the low bits being an index into the chunk and the high bits being an index into a vector of pointers. It has many of the benefits you list and is a bit simpler I think. It also supports essentially the entire vector API if desired. Both bi-directional and even random access are reasonably efficient. Good locality, etc.
Sat, Oct 7
Thu, Oct 5
(quick responses, still working on code updates from the review)
I've taken some time to go through the code in NewGVN that computes similar things based on the suggestion from Danny. These now do *very* similar things in terms of walk. The differences I've seen are:
Wed, Oct 4
FWIW, I think with the latest patch update this has now addressed Danny's primary concerns, and should be much more effective at avoiding re-visiting regions of IR.
Rebase (and move to mono-repo layout, sorry for any disruption).
Tue, Oct 3
Mon, Oct 2
Is there any way that the debug instructions can become invalid due to this transform? I'm not seeing anything obvious, but wanted to make sure you've thought about this too.
I generally like the direction of the cleanup. A few high level questions:
I'm generally fine with this going in given the measurements.
Sun, Oct 1
Wed, Sep 27
Tue, Sep 26
(Marking as accepted now that Easwaran has had a chance to look as well)
This is looking pretty good. Some nit-picky improvements below. I'd love for Easwaran to get another chance to look at this too though...
Fri, Sep 22
Mostly minor suggestions below...
Thu, Sep 21
Can you revert the unrelated formatting changes? Otherwise reviewing this is a bit hard.
Could you include analogous updates to the new PM's pipeline? (I can do it myself in a follow-up if you'd rather...)
Wed, Sep 20
Rebase (no substantive changes yet...)
Tue, Sep 19
Generally looks good. Description should likely be updated to reflect changes made during review. Feel free to submit!
I suspect Reid is in a much better position to look at debug info changes to SROA than I am, but let me know if I can help...
(marking as needing changes to clear dashboard)
Marking as needing changes to clear dashboard.
Just wanted to say thanks so much for working on this -- its a huge improvement to the inline cost.
Sorry for the awful delay when I got pulled into stuff, should be able to stay with the (small) stuff left in the review.
Generally like the API changes here. The enum seems a big improvement.
LGTM, nits below.
But, the verifier itself will just "crash". It won't print a stack trace, but I don't see why that's much better? And this flag is supposed to be a developer option and not a user facing one, so I'm somewhat confused at what the intent is here...
Sep 15 2017
Sep 14 2017
Sep 8 2017
I generally like the direction, peanut gallery comment. No need to wait for me to land or anything...
So, I'm still parsing your first comment Danny, but seeing the second, I think there may be some confusion here...
Sep 7 2017
Update based on code review from David and Hal.
Rebased, fixes included. Anything else before I land this?
Rebase, including Craig's fixes.
Sep 6 2017
Sep 5 2017
Aug 29 2017
A quick skim of Agner's suggsets that all of Core2, Nehalem, Westmere, Bulldozer, Piledriver, Steamroler, and Zen all do nearly full blown macro fusion for cmp/test and a branch. Even the Via Nana apparently does some of this apparently....
Put differently, maybe we only want this on processors *older* than sandybridge... but today, we don't. and we don't have a lot of tuning for older processors... so maybe we should just nuke it going forward.
I'm fine with this, but it would be good to get confirmation that there is a genuine partial flag update stall issue we're avoiding with this... As long as you all can confirm that this is a real issue even on modern processors, then all this makes sense. =D
Aug 26 2017
Generally speaking, I love this kind of thing. I think it makes tests and the actual assembly dramatically easier to read.
Aug 25 2017
Comments addressed, thanks!
Update with minor formatting fixes and a major fix to correctly detect usage of
CF from an operation and suppress toggling between ADD and SUB in that case.
Switch to a more verbose (more lines of code) pattern for selecting opcodes in
the ADD and SUB block. This will become more important when adding AND, OR, and
XOR which all want the same behavior.
Clean up some stale cruft in the test.
Aug 24 2017
Another update from review.