Fri, Jun 23
How do we end up in a state where the block invocation function is referenced external to the TU?
I think we just use "0" as a sentinel value to indicate the block doesn't need a mangling number. getLambdaManglingNumber works the same way? See CXXNameMangler::mangleUnqualifiedBlock in the Itanium mangler.
I'll think about the SelectionDAGBuilder::visit issue myself, and maybe come up with a followup patch.
Thu, Jun 22
Missing testcase... although I'm not sure how you would write one.
@efriedma hmm... using getBlockManglingNumber causes the name to be duplicated. I'll look into that.
I think this is a fairly uncommon scenario, so the size increase is not a concern in general.
LGTM with the comment fixed.
A lot of these combines look very similar to the work done for ISD::ADDCARRY/SUBCARRY; see https://reviews.llvm.org/D29872 . Can we reuse that work rather than add a bunch of ARM-specific code?
There's a isDereferenceableAndAlignedPointer helper which does this sort of check. See also https://reviews.llvm.org/D34467.
Patch is missing context.
This seems likely to lead to surprising behavior. LastCallToStaticBonus is very large, so we'll end up duplicating a lot of code in some cases. On the other hand, it isn't infinitely large, so you can't depend on always_inline triggering the behavior reliably.
It looks like it doesn't easily support long double.
Tue, Jun 20
Yes, this is what I meant; thanks.
If we get to "memcpy(z, y, 10);" without "memcpy(y, x, 10);", I'd expect we don't care whether "y" contains uninitialized bytes or the global constant: either way there is no buffer overflow, which is what I am trying to fix.
AssertingVH catching a bug. Cool. :)
Hmm, actually, thinking about it a bit more, I'm not sure this patch is right.
Mon, Jun 19
Testcase? (I think this affects inline asm.)
Sorry about the delayed review.
I wonder if it makes sense to more aggressively transform the IR in certain cases rather than bail out, since we can prove that the IR is reading uninitialized memory. Something for a future patch, I guess, if someone has a testcase where it matters in practice.
I just tried your updated patch, and I'm seeing crashes in make check. (Assertion `Ret.getOpcode() == ISD::MERGE_VALUES && "Ret is merged value node comprising constituents stack locations holding result"` failed.)
Fri, Jun 16
Any idea what effect this has on performance, if any?
Umm, that misses the point of my comment; we want to reduce the number of paths through the code by requiring that Operands is never empty (except for operations which don't have any operands).
I apologize, my review last night wasn't very insightful. Hopefully better comments today.
Thu, Jun 15
Please make sure to put llvm-commits in the subscriber list when you post a patch.
Wed, Jun 14
Changing it in this patch would be considered an "unrelated change".
Needs testcase. (-verify-machineinstrs will run the verifier even if it's off by default for your target.)
mayReadOrWriteMemory() just checks for the "readnone" attribute (http://llvm.org/docs/LangRef.html#function-attributes). If it's set, LLVM knows the call can't read from or write to memory; if it isn't set, the call might write to memory, so mayReadOrWriteMemory() conservatively returns true.
Can't you just tell LVI the blocks are dead without them being dead?
If the issue in JumpThreading is determining reachability at all times
Tue, Jun 13
Mon, Jun 12
Did you do an analysis of whether this produces the same counter values for single-threaded code?
Looks okay, but I haven't been reviewing this series of patches closely, so I could be missing something.
I'm not convinced it makes sense to invest any effort into loops with a backedge-taken count of zero. Yes, maybe we simplify the exit condition slightly earlier, but does that actually do us any good in practice?
Scalar evolution can already analyze the given testcase on trunk. Output of "opt -analyze -scalar-evolution":
Fri, Jun 9
Custom-lowering the constant pool to a global isn't that terrible, I guess... it's ugly, and we can't share the globals across basic blocks, but it should behave correctly, as far as I know.
Thu, Jun 8
LGTM with one minor tweak.
Wed, Jun 7
Oh, sorry about the delay; I lost track of this.
Tue, Jun 6
LGTM. (We could match this in DAGCombine, but probably not worth bothering.)
Please start a thread on cfe-dev about this; most developers don't read cfe-commits, and thinking about it a bit more, I'm not confident omitting frame pointers is really a good default. I would guess there's existing code which depends on frame pointers to come up with a stack trace, since table-based unwinding is complicated and takes up a lot of space.
Most targets keep the constants in the MachineConstantPool, and emit them with AsmPrinter::EmitConstantPool. I would prefer to align with the way other targets handle constant pools, if it isn't too hard.
We don't currently support MemIntrinsicSDNodes with more than one memory operand... but I don't think that would be hard to change. Probably worth fixing if you need it.
You don't need to ask for review for a trivial change like this.
Mon, Jun 5
I'd like to see a response from Chandler, since he was the last one to touch this code.
fadd NaN, undef -> undef isn't legal: any fadd involving NaN always produces a NaN. Similar reasoning applies to most other floating-point values.
The diffs to the ARM tests are clearly no good: you're splitting 128-bit vector loads into two 64-bit vector loads for no benefit.
Thu, Jun 1
Looks fine, but please fix the commit message to note that -reverse-iterate only exists in +Asserts builds.
There are a bunch of places in SelectionDAG which generate constant pools for obscure constructs. Maybe we need a separate TargetLowering hook to suppress constant pools in general, rather than just suppressing this particular DAGCombine? (grep for getConstantPool in lib/CodeGen/SelectionDAG/ .)
Correctly counting columns is a bit more complicated than that... for example, consider what happens if you replace ideëen with idez̈en. See https://stackoverflow.com/questions/3634627/how-to-know-the-preferred-display-width-in-columns-of-unicode-characters .
Consider something like this:
Wed, May 31
Okay, in that case the compile-time sounds fine.
The other alternative is that you could pass in an AttributeList (the result of getAttributes()) to the functions... that makes it more clear what exactly is getting used, but it's more complicated (and probably involves a bunch of refactoring to expose a function to check isNoBuiltin given a Function/AttributeList pair).
Would it be possible to pass in a CallSite to canConstantFoldCallTo, ConstantFoldCall, and SimplifyCall, to reduce the number of places we check this?