- User Since
- Nov 16 2012, 6:02 PM (312 w, 4 d)
Fri, Nov 2
I'm not very familiar with the MemorySSA pass but based on a fairly quick skim of it ...
It looks like the MemoryDef/MemoryUse objects are attached to the BB rather than the instructions.
Thu, Nov 1
It's unclear to me how LTO (or other cross-file inlining) would work here. I haven't given it much thought until now. My knee-jerk reaction is that we shouldn't be inlining from a FENV_ACCESS=OFF to FENV_ACCESS=ON location.
Somewhat unfortunate amount of boilerplate for each target AA pass, but I don't have a better suggestion right now.
Wed, Oct 17
Add some optimization remarks. Now we'll get things like this:
Fix variable initialization so that we don't infinite-loop when we refine.
Add the pointer-capturing check when potential exits are found.
Oct 11 2018
Currently, there's sort of a split in the LLVM source code about whether volatile operations are allowed to trap; this resolves that dispute in favor of not allowing them to trap. (This is the opposite of the position I originally took, but thinking about it a bit more it makes sense for the uses we expect.)
Oct 4 2018
I'm not sure where it's documented, but MMOs are definitely not guaranteed to be preserved (except in GlobalISel, where the verifier enforces this for some of the G_* instructions). I'm not aware of anywhere specifically dropping them, but code is not required to preserve them.
Sep 25 2018
Now that we understand the reason for the only degradation, we can turn this pass on by default. A subsequent patch will detect and remove self-copies in the pre-emit peephole.
Sep 21 2018
Is there any compile-time impact? Should we just simplify all instructions at this later point instead of having two passes?
I'm somewhat confused by the motivation. Can you explain why the fact that the operands are commuted causes a problem? In D51995, you just seem to be looping over them anyway.
Sep 17 2018
disable this fold when comparing noalias ptr with null
Unfortunately, I don't think that we can do this. noalias semantics only provide information about the overlap of memory accesses using the resulting pointers, in the context of other necessary constraints, and do not strongly constrain the pointer values. For example, it might be that:
Aug 25 2018
Does this mean that we can remove that previously-discussed x86 logic in Clang?
Aug 17 2018
Also, if you know, should this actually be the default behavior for all targets? I'm somewhat surprised if x86 and PPC are special cases.
Interesting. Why? Does this effectively match GCC's behavior?
Aug 16 2018
Another option would be to implement some sort of attribute-based overloading. Then OpenMP can provide its own version of the device-side library function without clashing with system headers.
Aug 13 2018
This code certainly looks cleaner.
The DSE test is fine, and I think that you should keep it, but can you please add a direct AA test (like, e.g., test/Analysis/BasicAA/cs-cs.ll)?
Makes sense to me. Obviously needs some test.
A couple of minor comments, but otherwise, LGTM.
Aug 4 2018
Seems like an interesting idea. What's the motivation for doing this?
Aug 3 2018
... As such, how to fall back when the transformation doesn't happen is almost as important as what to do next when it does.
Aug 1 2018
Assuming that @aaron.ballman is still okay with this, I think that we should move forward. @erichkeane, I suspect that fixing the ordering problems, which were often arbitrary before and are still so, is a larger change that we'd want to split out regardless (not to mention the fact that we'd need to decide how to fix them). Erich, is that alright with you?
Jul 14 2018
There's nothing special about debug intrinsics here except that there are a lot of them. The problem, as far as I can tell, is that we're repeatedly using dyn_cast on each instruction and doing multiple redundant tests; adding yet another redundant test will help when there are a lot of debug intrinsics but makes things incrementally more expensive for all other kinds of instructions. Thus, this doesn't seem like the right way to fix this. Currently we test for IntrinsicInst, then for CallInst (which is always true whenever the IntrinsicInst test is true), and then we always test for StoreInst: there's no 'else', so we perform that test even when the IntrinsicInst/CallInst test has already succeeded (which includes debug intrinsics).
Jul 13 2018
We have a check in aliasSameBasePointerGEPs to avoid PR32314 (which I'll glibly summarize as: we need to be careful about looking through PHIs so that we don't, without realizing it, end up comparing values from different loop iterations). The fact that this looks back through an unknown number of PHIs makes me concerned that we'll get into a similar situation.
Misc cleanup and try to validate ret attrs instead of disqualifying them all.
Jul 12 2018
Updated to not infer speculatable when there are instructions with non-debug metadata, when there are loads with alignment requirements that we can't independently prove, or for functions with value-constraining return-value attributes.
LGTM (there are a lot of changes here, but given that it produces no changes to existing matching tables, that seems like pretty good test coverage).
Jul 10 2018
I specifically didn't want to do this for the YAML output. The tool consuming the YAML should demangle if it wants. It's hard to go backward, and so if the tool needs to map back to symbols in the program, it needs the mangled name. That having been said: