- User Since
- Apr 14 2013, 11:48 AM (318 w, 1 d)
Thu, May 16
Mon, May 13
Updated to address issues raised during the review:
Mon, Apr 29
Fri, Apr 26
Sorry if I was not clear in my description, but what I meant to illustrate is that the RA allocates the VRegs in the order I listed them. So first %21 is allocated $r2d, and then the next VReg assigned is %12, and then %13, %14, ..., %20, %21.
> %12 -> $r0d // does not see that $r2d would be better
Thu, Apr 25
It looks like this new test case fails on big-endian systems, like this:
Wed, Apr 24
Tue, Apr 23
Mon, Apr 22
Those test case changes represent an actual improvement here, so this looks good. Thanks!
Apr 19 2019
This looks really promising, in particular the reduction in spills and copies. Can you check that this also addresses the problem described here: https://reviews.llvm.org/D22011 ?
Apr 16 2019
Apr 15 2019
Not a full review yet, but a couple of comments (in the test case) where the resulting code could be simplified.
Apr 10 2019
On SystemZ, there is no dependency from InstPrinter to MCTargetDesc (or anything else in the target) as far as I can see. That means we don't have an issue, right?
Apr 4 2019
Apr 3 2019
Mar 29 2019
The SystemZ test case change is fine with me, thanks!
Mar 17 2019
Mar 4 2019
Feb 27 2019
Feb 26 2019
OK, this version LGTM. Thanks!
Feb 25 2019
Feb 21 2019
This looks generally good to me. Some additional options to possibly make the code simpler occurred to me:
The SystemZ part LGTM, thanks.
Feb 20 2019
Feb 19 2019
Thanks, this should now cover all intrinsics with immediate arguments. Some additional comments inline (mostly cosmetic, with one real fix for int_s390_vfisb).
Feb 18 2019
Could we actually handle FP128 as well with a present FeatureVectorEnhancements1?
I'm not quite sure I understand the logic why some intrinsics that require immediate arguments are marked with ImmArg, but others are not?
Feb 14 2019
Ah, good catch! LGTM.
Feb 13 2019
Hmm. Actually, I'm now wondering why we need to reject anything in the first place. Can't we improve isFPImmLegal to accept *anything* that can be constructed via any of the vector instructions (VGBM, VGM, VREPI)?
Feb 12 2019
Yes, I think this version is better.
Feb 11 2019
It seems you're assuming VGM is always available. I guess there needs to be a check for hasVector() somewhere.
Feb 8 2019
I think 1a would be the best option, indeed.
Well, for replication we definitely need proper float support. For VGBM, we could ignore floats since (except for the all-zero and all-one pattern) there aren't really any common FP constants that can be created via a VGBM pattern. But that isn't true at all for replication ...
Feb 6 2019
Now fixed in a slightly different manner in as r353304.
Yes, I think this version makes most sense. LGTM.
Jan 29 2019
Jan 28 2019
Jan 24 2019
I am not quite sure if this is the best solution, but as it is now tryBuildVectorByteMask() is used first during legalization to build a new BUILD_VECTOR with the right constants, and then again in Select() to get the same mask back again. I first thought it would be possible to just leave the BUILD_VECTORS during legalization, but then I found a case where this doesn't work which involved ConstantFP<nan>, which ended up in the constant pool.
Thanks for the heads-up! This may indeed be interesting for SystemZ, but I think it's still probably preferable to do it in the back-end like your alternative approach does, that will allow us to handle some special instruction selection issues we'll likely run into ...
Jan 22 2019
What's the reason for using SkipPHIsLabelsAndDebug instead of, say, skipDebugInstructionsForward? It's not obvious to me that skipping PHIs and labels is safe at all those places ...
Dec 20 2018
This causes test case failures due to no longer linking with -lrt on Linux.
Dec 18 2018
Dec 17 2018
Ah, right, I missed that.
Just looking at the SystemZ test cases:
Dec 15 2018
Well, mayRaise Exception is purely a MI level flag. I struggle to see where optimizations on the MI level would ever care about rounding modes in the sense you describe: note that currently, MI optimizations don't even know which operation an MI instruction performs -- if you don't even know whether you're dealing with addition or subtraction, why would you care which rounding mode the operation is performed in? MI transformations instead care about what I'd call "structural" properties of the operation: what are the operands, what is input vs. output, which memory may be accessed, which special registers may involved, which other side effects may the operation have. This is the type of knowledge you need for the types of transformations that are done on the MI level: mostly about moving instructions around, arriving at an optimal schedule, de-duplicating identical operations performed multiple times etc. (Even things like simply changing a register operand to a memory operand for the same operation cannot be done solely by common MI optimizations but require per-target support.)
Dec 14 2018
Dec 10 2018
Dec 4 2018
The one problem with your "new" code is that it now forces back-ends to implement something, or else code involving constrained intrinsics will trigger internal compiler errors. It might be preferable to avoid those ...
GCC has never supported the __float128 type on SystemZ, because "long double" is already IEEE-128 on the platform. GCC only supports a separate __float128 type on platforms where "long double" is some other type (like x86 or ppc64).
As an aside, it would be nice if we had a test case that verifies the explicit values of alignof(max_align_t) on all supported platforms. This is an ABI property that should never change.
Dec 3 2018
Nov 29 2018
Nov 28 2018
Thanks for pointing out those debug info differences, I agree that this might be a problem.
Nov 27 2018
I must have missed the earlier discussion, but I agree with @bjope 's comment earlier that subreg_h32(V0) -> F0S is actually wrong; there should not be any subreg_h32(V0) at all!