- User Since
- Aug 10 2016, 1:07 PM (141 w, 6 h)
Well, I guess there are sort of two potential approaches to emitting it "later". One, we could emit something like ".word (endfunc-beginfunc)/4+0x10800000", as long as we have some conservative estimate we could use to prove that we don't need to split the unwind data. The other would be to use a dedicated, relaxable fragment, which could grow to accommodate splitting the unwind data if necessary. The dedicated fragment is probably a bit more bulletproof, and more consistent with the way we handle relaxation in other contexts, but it would require a lot of refactoring to implement.
For NotPod, it is aggregate which is specific in the document
It looks like there's some missing documentation in the ARM64 ABI document involving homogeneous aggregates... in particular, it looks like non-POD types are never homogeneous, or something along those lines. I guess we can address that in a followup, though.
LGTM. It seems easy to trigger false positives with this lint, so I'm not sure how useful it is in general, but this is clearly an improvement.
Just quickly reviewing the target-independent changes.
Partially addresses review comments. Still need to think about the mapping symbol vectors a bit more.
Is it possible to enable fast-isel and/or global-isel for ARM64 Windows? Do we need additional code for them? (I'm okay with making those changes separate patches, but it should be easy to at least add testcases now.)
Mon, Apr 22
that's a case I don't think can be done in C even with GNU C extensions
Do you need to modify AArch64TargetLowering::isEligibleForTailCallOptimization to prevent a tail call in cases where the tail call would return the wrong value?
In terms of how to generally improve disassembleObject, I guess there are a few different ways to think about it...
so the easiest way to fix this bug is to remove !LegalOperations before ||
Fri, Apr 19
In former condition, such transformation is acceptable even operation ADDE is illegal and it only cares about type legality because illegal operation will be legalized later in DAG legalization phase
Thu, Apr 18
Oh, I see, we start with an i32 ADDE, then https://github.com/llvm/llvm-project/blob/e7fe6dd5edb828e702d02c7cdc6565ace4c84f5b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10232 narrows the operation. We probably shouldn't perform that transform if it would produce an illegal ADDE.
The constant cases look better, but I'm not sure if this is a win if both operands are variables.
I guess that's right, technically; "getScalarSizeInBits() / 8" will round down, and getStoreSize() will round up. But still, please fix the code so it's clear that check is happening.
The use of "?" here doesn't seem right. All those instructions where you're adding a "?" do in fact require that the bit in question must be zero.
This fix is wrong. For vectors where the element type is not byte-sized, loads can't be "consecutive" in the sense of this function; the store size of <2 x i1> is one byte.
ADDC etc. are deprecated in favor of ADDCARRY. Nothing should generate ADDC etc. unless they're marked "Legal" for the type in question... and they should be Expand by default.
Wed, Apr 17
Mon, Apr 15
I wasn't thinking in terms of keeping the addressing mode legal, just avoiding destroying the work of constant hoisting. Constant hoisting won't split a constant in the way you're suggesting. And it's relatively easy to write patterns to split "load x+c" in the most efficient way if "c" has a single use. I guess the more restrictive version would allow splitting single-use constants as a DAGCombine, or earlier? Not sure why you'd want to, though.
Though I modified the avx512 intrinsics to not have masking built in.
Fri, Apr 12
I can't find any previous discussion of this particular check. I agree it isn't necessary for correctness.
Tue, Apr 9
Mon, Apr 8
(Probably should be using setPreservesCFG().)
Fri, Apr 5
Do we not have some existing file we've been using for fp16 tests?
Probably better to put it inside the branches for ADD/SUB, to make it clear that it's those specific operations that have a base register?
Should we care about the number of uses of N0 here? It should generally be "safe" to reassociate in cases where it only has one use.
Thu, Apr 4
That layout generally makes sense, sure. Maybe you could get away with one bit per slot by sticking a bit to distinguish between either empty/tombstone into the main table, but I guess it's not really a big deal either way if you're talking about one or two bits per entry. I guess two bits makes failed lookups faster.
Adding llvm-commits to the CC. Please be more careful about that in the future... see http://llvm.org/docs/Phabricator.html
Why do we need to rotate the loop before unrolling? llvm::UnrollLoop currently refuses to unroll loops where the latch is an unconditional branch, but that isn't a fundamental limitation, as far as I can tell. We already support unrolling loops where the latch is not the exit branch; allowing loops where the latch doesn't exit at all is a minor extension. Granted, it might be more efficient to explicitly rotate the loop before unrolling, so we don't clone quite so much code.
Wed, Apr 3
What's the alternative? We can't dictate the behavior of target-specific constraints from target-independent code; some targets actually have boolean registers.
Tue, Apr 2
Mon, Apr 1
Spent a bit more time looking, and thinking, and I think I have a clearer picture of what's happening.
I'll look a little more at D58746, but that seems more like the right direction
There are no multiple threads, etc. on msp430
Yes, please refactor the code. We define a bunch of helpers in ARMBaseInstrInfo.h; that's probably fine.
I'm not sure where or if there are official docs for libm
Thu, Mar 28
Update test to use -verify-machineinstrs, and to use CHECK-LABEL in a way that's less likely to cause confusion.
Added a couple more tests, including one that generates the subs+sbcs+lsls sequence.
This isn't correct; GlobalMerge is not legal for symbols with weak linkage, and there isn't any way to correctly emit an alias with "common" linkage in ELF.
Wed, Mar 27
LGTM. Thanks for sticking with this for so many rounds of review.
Tue, Mar 26
When I tried your example it was pushing r7 and popping r3. I presume this is OK, providing r7 isn't modified?
For that particular case, I think we could suppress the false positive using DiagRuntimeBehavior. How many total false positives are we talking about, and how many can the compiler statically prove are unreachable?