- User Since
- Aug 10 2016, 1:07 PM (166 w, 6 d)
Mon, Oct 21
For FCOPYSIGN, specifically, LegalizeDAG is going to query the target to ask, "is FCOPYSIGN legal for result type X", using the getOperationAction() API? The target has a few different ways to respond: here, the relevant possibilities are "Legal", "Expand", or "Custom". I guess the issue here is that on RISCV, this query is returning "Legal", when it actually isn't legal for all combinations of legal result/input types?
Fri, Oct 18
I was considering whether, instead of making getCalleeSavedStackSize approximate the stack size when running an individual pass, we should serialize it in the MIR. But this is probably okay as-is.
For the LSR thing, see https://reviews.llvm.org/D68592 .
Hopefully someone more familiar with LoopDataPrefetch can review, but this looks reasonable.
Thu, Oct 17
(Merged to the branch.)
Removed redundant check. Added support for x86.
Wed, Oct 16
I do believe that we need it for all the targets (x86, x86_64, armv7, aarch64).
Tue, Oct 15
If that doesn't happen, then they are 'trivially' dead and deleted.
LGTM, assuming you fix your tree so test/MC/AsmParser/preserve-comments-crlf.s isn't modified.
Mon, Oct 14
Oh, that makes sense. But needs a comment on the pattern, and in cvtThumb2LDRB_PRE/POST.
Bug2: test case provided above. Please Lewis take a look at how this case can be fixed.
Hmm, that's a bit more complicated than I hoped it would be... but I don't see any obvious way to simplify it.
It looks like this is still waiting for review comments to be addressed.
Handling what Wei's case will be a nice thing to have, but it may require more significant change in JT.
I'm inclined to defer doing that for now, if that's OK :)
Adding aliases seems like the right approach; we have aliases for a bunch of other instructions.
Fri, Oct 11
Hmm, the ObjC changes were simpler than I expected. And you managed to avoid making changes to overload sets.
If the terminator is a "ret", or some arbitrary terminator that doesn't simplify, it's not really "threading"; it's just tail duplication. That's likely profitable in some cases, but using ThreadEdge to perform the transform seems confusing.
Thu, Oct 10
It looks like llvm-commits wasn't CC'ed, somehow.
Wed, Oct 9
Found another broken case. Fix, and add more test coverage.
The sequence from https://reviews.llvm.org/D67986#1692153 works, I guess.
(b) is an issue, as pointed out in https://lwn.net/Articles/726587/ (grep for valgrind) : from valgrind point of view, accessing un-allocated stack memory triggers error, and we probably want to please valgrind
Doing the call *after* the stack allocation is also not an option, as a signal could be raised between the stack allocation and the stack probing, escaping the stack probe if a custom signal handler is executed.
Sorry, I meant the "probe-stack" attribute.
Is there some reason this isn't using the existing stack-probe-size attribute?
The refactoring of the getCurrentMangleNumberContext API mixed with functional changes makes this much harder to review quickly. Please separate it out.
Tue, Oct 8
Thanks for pointing that out.
Mon, Oct 7
I know this contradicts existing code in SCEV that tries not to break LCSSA, I'm wondering if we should just remove those in favor of teaching SCEV expander about preserving LCSSA.
we want freeze to be fully agnostic, it should not care *at all* what the type is, right?
Is this patch still relevant?
This was reverted.
Fri, Oct 4
I don't understand what the check is supposed to be doing in the first place. If we're in LibCallSimplifier::optimizePow, we've already checked that the call is actually calling a function with the right semantics, not just the name.
This *is* unusual. Is this too ugly to live?
Formally, whether we lower the "fcmp" instruction using vcmp or vcmpe shouldn't matter; the difference only affects floating point state the code isn't supposed to access anyway.
Wed, Oct 2
You could do something like if (!memccpy (d, s, '\0', size - 1)) d[size - 1] = 0;, I guess. That's a few more instructions that I'd really like, but I don't see a better alternative.
-fno-builtin-memccpy doesn't work because memccpy isn't listed in include/clang/Basic/Builtins.def. We should probably fix that.
Oh, hmm... actually, just realized something; is the new formulation actually equivalent? Specifically, is it okay to write to "d[size - 1]" if the string is short? Say, for example, you have something like snprintf(buf, 10, "%s", "x"). That normally writes to two elements of the array: 0 and 1. The rewritten version writes to three elements: 0, 1, and 9.
It sounds like memccpy is part of C20. Can we not do this transform unless LangOpt says we're C20 or greater? Otherwise the Linux kernel doesn't implement this routine.
Tue, Oct 1
Mon, Sep 30
I think you've basically landed all the significant pieces of this already? But sure, LGTM.
Yes, that's right. Thanks for spotting that.
Thu, Sep 26
We dont even need to know n. If c is not in src, we have just memcpy(dst, src, n), no?
(Comments using argument names from the signature void *memccpy(void *dest, const void *src, int c, size_t n);.)
Wed, Sep 25
Obviously ignoring the attribute is correct; signext is just an ABI attribute which has no semantic effect at the IR level. But why was the code to check it added in https://reviews.llvm.org/D61381? I assume it was supposed to be some sort of profitability check.
Tue, Sep 24
Does this have any significant impact on -fsyntax-only performance?
Do we have accurate TLI information for memccpy? Not sure exactly how widely available it is.
I think I see the gap in communication here.
Would it make more sense to just say this isn't a triangle? We don't really want to do loop peeling in IfConversion.
If SelectionDAGLegalize::ExpandNode is expanding SIGN_EXTEND_INREG of i1 to a non-canonical form, maybe it makes more sense to fix that, instead of trying to avoid that codepath? I mean, this is fine, given we know it's going to get expanded anyway. But the same problem might come up again later out of target-independent code generating an i1 SIGN_EXTEND_INREG.