- User Since
- Oct 23 2013, 6:32 PM (234 w, 3 d)
Tue, Apr 10
I am concerned that you doing too many things in one patch and that the mix of refactoring and functional changes may contain hard to spot bugs. In particular, the absence of hasValue checks in various updated logic worries me.
Thu, Mar 29
Fri, Mar 23
Mar 20 2018
Mar 16 2018
Fix Anna's comments.
This is a good idea. I'd copied the basic structure of this from the mem deref printer, but I like the annotation version better. Mind if I submit this one (so that I can start getting some basic testing around this code), and then return to the annotation as a follow on? I'll change over mem-deref while I'm at it.
Mar 15 2018
Note: The variant I committed turned out to be less powerful in one particular case than the reviewed code. I spotted that after submission and fixed it in revision 327655. The fix also improves the invariant.start case which had an analogous problem.
Mar 14 2018
It was pointed out that upstream support for dumping the whole module already exists. It's spelled as: -print-after-all -print-module-scope
address Max's comment. I didn't manage to actual test the multiple latch case through LICM though, see attempt and comment.
Mar 9 2018
address Anna's comments
Clarify a couple of comments.
Mar 8 2018
Feb 23 2018
Feb 17 2018
LGTM provided that the backend changes to take the min of the two have already landed.
Feb 15 2018
Just to second this, I think there's a lot of potential gain to be had by being able to select cold blocks for size and hot blocks for speed. As stated, definitely off topic for this review though!
Feb 2 2018
Jan 29 2018
Is it worth considering doing this as a late peephole? (i.e. after register allocation if the register happens to be appropriate?)
Jan 22 2018
I generally like the direction of the API. Again, not really a qualified reviewer, but I added a bunch of small comments on readability of the API choices from a non-expert. Making this stuff more discoverable would be a major plus and this feels like a potential major step in that direction.
Not really a qualified reviewer for this, but I'll comment this reads more cleanly to someone unfamiliar with the details of the code.
Jan 20 2018
Also, LGTM since I did have time to look through this time. :)
In a case like this where I only had minor comments, please don't wait for me. If davade things this is good to go, run with it. I tend to be very busy and blocking anyone's forward progress on my availability is a bad idea.
Jan 18 2018
Jan 17 2018
Bunch of minor comments, but once those addressed, likely good to go.
Marking to get off my queue.
LGTM w/minor comment applied.
Jan 12 2018
Much closer to reasonable, likely one or two more iterations.
A few minor changes needed, but generally looks close to ready.
Jan 11 2018
Should we consider changing that? We can always revise the LangRef to allow constants if that simplifies the implementation.
Jan 10 2018
Meta comment: This change is too large *conceptually*. You need to break it apart in a way that the test changes are "obviously correct" because there's no possible change in behaviour. At the moment, it a behaviour change might be lurking in the test changes and that'd never be seen.
Reading along for my own education, feel free to ignore if I'm way off base.
Jan 8 2018
This looks like the right approach to me. I think the previously expressed concern was about making the IR representation overly specific to what current hardware supports. This looks largely orthogonal to me and much less confusing as well.
Jan 5 2018
A design variation on this which may be worth considering is to phrase this as a speculative use barrier. That is, don't include the load at all, simply provide an intrinsic which guarantees that the result of the load (or any other instruction) will not be consumed by a speculative use with potential side-channel visible side effects.
To be explicit, you can ignore statepoints and patchpoints for the moment. As the only major user of this functionality, we'll follow up with patches if needed. We're still in the process of assessing actual risk in our environment, but at the moment it looks like we likely won't need this functionality. (Obviously, subject to change as we learn more.)
Jan 3 2018
Jan 2 2018
Catching up on old review traffic. Nothing critical.
Don't have time to get back to this.
Dec 30 2017
Dec 29 2017
Dec 26 2017
Dec 21 2017
Second attempt, this time accounting for sub-registers
Dec 17 2017
Much cleaner, thanks!
Dec 13 2017
Much cleaner, thanks!
Adding a machine pass which just called analyzeBranch on each BB and printed the result would be straight-forward if you wanted to cleanup the testing.
Dec 12 2017
The code change here is a clear improvement. Given no response from the test case author, LGTM.
Dec 11 2017
Dec 5 2017
I'm very strongly of the opinion this patch should not land, ever. I don't believe this change is desirable or necessary. If I'm reading through the history here correctly, what we're worried about is a two phase transform: 1) we replace a load with a phi of previously stored values, and then 2) we replace that phi with a single add instruction. It's specifically that *second* transformation which exposes the worried about miscompile. My argument here is simple: a test case with a *manually written* phi in the same place as our PRE based implementation of FRE inserts one would expose the same miscompile. Given this, we might very well have a bug in whatever bug does the second transform, but there is nothing wrong with the FRE/PRE here.
Dec 4 2017
small drive by comments.
There are a couple of NFC parts that need extracted here before the semantic change can be easily reviewed.
LGTM w/changes applied.
LGTM w/comments addressed before submit.
Dec 1 2017
Nov 27 2017
Initial round of comments focused on the metadata semantics only. I completely skipped the instcombine changes.
Nov 21 2017
Nov 14 2017
Max and I discussed and realized this comes down to simply needing a much simpler and straight-forward comment. :) If AR is not defined on entry to L, it can't be LoopInvariant with respect to L. The distinction between L being a parent of AR->loop vs a sibling vs a nephew, cousin, etc... really doesn't contribute anything here.
Nov 7 2017
Structural suggestion: pick either of the two subtasks and implement them in isolation. That is *either* replace the ConstantRange with an owned ConstantRange allocation and implement the tagged pointer OR implement the ConstantRange folding set. Doing both in one change is needlessly complicated. Once one is done, the other follows more naturally. I'd suggest the first then the second, but either order is defensible.
LGTM once dependent changes land.
relatively minor code comments.
Max, I don't quite follow your discussion enough to know for sure, but I suspect there's something wrong with your framing here. A induction variable in one loop may be loop invariant with respect to sibling loop. That's normal. Is there maybe a subcase here? Or are you looking at flawed assumption in caller code?
LGTM, but please land a day or two separate from any other change. i.e. if this exposes problems (pass ordering?) let's make that obvious.
LGTM once dependent revisions have been in tree for a couple of days.