uweigand (Ulrich Weigand)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 14 2013, 11:48 AM (309 w, 4 d)

Recent Activity

Sun, Mar 17

uweigand added a comment to D59363: [SelectionDAG] Add icmp UNDEF handling to SelectionDAG::FoldSetCC.

I think the SystemZ test changes look ok (by replacing the undef operands with an argument the tests are more or less unaffected by this patch), but as usual I will let Uli do the formal approval.

Sun, Mar 17, 2:31 PM · Restricted Project

Mon, Mar 4

uweigand added inline comments to D58884: [DAGCombiner][X86][SystemZ][AArch64] Combine some cases of (bitcast (build_vector constants)) between legalize types and legalize dag..
Mon, Mar 4, 6:09 AM · Restricted Project

Wed, Feb 27

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Well, I have a question: why is it not enough to use the use/def of a physical register at the MI level to prevent reordering of the instructions that set/get the FP status register? For example, if one instruction sets the rounding mode and a later instruction reads the rounding mode, the sequence would not be reordered because of the dependency on the rounding mode bits (FP status register).

This would be enough to handle rounding-mode aspects. But if FP exceptions are enabled, we have additional restrictions: all FP exceptions could then have additional side effects (trap) and therefore cannot be executed speculatively, even where that would be OK from a status register dataflow perspective.

Also, one goal of the design is that we do not have to duplicate all FP instruction patterns in the back-end; we want a single pattern that can handle both the strict and non-strict case. If we wanted to add a phys reg def, that would then have to be optional in some form, and there's no good way I can see to do this in the current infrastructure.
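
The exception argument above can be observed outside the compiler through the C++ floating-point environment. What follows is only a minimal sketch, assuming IEEE 754 arithmetic and a host whose `<cfenv>` supports the status flags; the helper name `speculatedDivisionRaises` is hypothetical, and the `volatile` qualifiers are there solely to keep the compiler from folding the division away. It shows that a division whose result is never consumed (as a speculatively hoisted instruction would be) still raises the divide-by-zero flag, which would become a trap if exceptions were enabled.

```cpp
#include <cfenv>

// Hypothetical helper: performs a division whose result is discarded, the way
// a speculatively executed instruction would be, and reports whether the
// divide-by-zero status flag was raised as a side effect.
bool speculatedDivisionRaises(double num, double den) {
  volatile double a = num;  // volatile: force the division to happen at run time
  volatile double b = den;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double unused = a / b;  // "speculated" result, never consumed
  (void)unused;
  return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

Calling `speculatedDivisionRaises(1.0, 0.0)` reports the flag even though no caller looks at the quotient, which is why a pure status-register dataflow view is not sufficient once exceptions can trap.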

OK. Using a tablegen multiclass to define two kinds of instructions is not too much work. There would be one for the strict FP pattern and one for the normal non-strict pattern; in some targets both would map to the same instruction, but with different flags (one has the hasSideEffect flag, the other hasn't) or a phys reg def/use?

Wed, Feb 27, 4:45 AM

Tue, Feb 26

uweigand added a comment to D58521: [DAGCombiner] allow truncation of binops after legalization if desirable.

I tried this patch on SystemZ / SPEC, and as before it seems to have only a very minor impact on the number of files changed (7), and none on performance (seemingly unaffected).

I think the SystemZ test looks good, but I leave the final approval to Uli as usual.

Tue, Feb 26, 2:27 AM · Restricted Project
uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Well, I have a question: why is it not enough to use the use/def of a physical register at the MI level to prevent reordering of the instructions that set/get the FP status register? For example, if one instruction sets the rounding mode and a later instruction reads the rounding mode, the sequence would not be reordered because of the dependency on the rounding mode bits (FP status register).

Tue, Feb 26, 2:27 AM
uweigand accepted D58270: [SystemZ] Load all vector and FP constants in Select() .

OK, this version LGTM. Thanks!

Tue, Feb 26, 1:25 AM

Mon, Feb 25

uweigand added a comment to D58270: [SystemZ] Load all vector and FP constants in Select() .

Yes, that seems to simplify things. I however am somewhat wary to create DAG nodes during all the queries during legalization, so I instead first make a vector of unsigned and then make the actual DAG operands in loadVectorConstant().

Mon, Feb 25, 3:55 AM

Thu, Feb 21

uweigand added a comment to D58270: [SystemZ] Load all vector and FP constants in Select() .

This looks generally good to me. Some additional options to possibly make the code simpler occurred to me:

Thu, Feb 21, 4:10 AM
uweigand added a comment to D58490: [ARM] Be super conservative about atomics.

The SystemZ part LGTM, thanks.

Thu, Feb 21, 3:35 AM · Restricted Project

Wed, Feb 20

uweigand accepted D58353: SystemZ: Add ImmArg to intrinsics.

LGTM, thanks!

Wed, Feb 20, 7:29 AM

Feb 19 2019

uweigand added a comment to D58353: SystemZ: Add ImmArg to intrinsics.

Thanks, this should now cover all intrinsics with immediate arguments. Some additional comments inline (mostly cosmetic, with one real fix for int_s390_vfisb).

Feb 19 2019, 3:00 AM

Feb 18 2019

uweigand added a comment to D58270: [SystemZ] Load all vector and FP constants in Select() .
Could we actually handle FP128 as well when FeatureVectorEnhancements1 is present?
Feb 18 2019, 9:56 AM
uweigand added a comment to D58353: SystemZ: Add ImmArg to intrinsics.

I'm not quite sure I understand the logic why some intrinsics that require immediate arguments are marked with ImmArg, but others are not?

Shouldn't we mark all of them? The ones I see missing in your patch are:

defm int_s390_vfae  : SystemZTernaryIntCCBHF;
defm int_s390_vfaez : SystemZTernaryIntCCBHF;
defm int_s390_vstrc  : SystemZQuaternaryIntCCBHF;
defm int_s390_vstrcz : SystemZQuaternaryIntCCBHF;
def int_s390_vfmaxdb : Intrinsic<[llvm_v2f64_ty],
def int_s390_vfmindb : Intrinsic<[llvm_v2f64_ty],
def int_s390_vfmaxsb : Intrinsic<[llvm_v4f32_ty],
def int_s390_vfminsb : Intrinsic<[llvm_v4f32_ty],
def int_s390_vftcidb : SystemZBinaryConvIntCC<llvm_v2i64_ty, llvm_v2f64_ty>;
def int_s390_vftcisb : SystemZBinaryConvIntCC<llvm_v4i32_ty, llvm_v4f32_ty>;
def int_s390_vfidb : Intrinsic<[llvm_v2f64_ty],
def int_s390_vfisb : Intrinsic<[llvm_v4f32_ty],

Maybe it would actually make more sense to add the ImmArg directly in the SystemZ*Int* helper macros; those intrinsics really all need immediate arguments. (The sole exception I can see is verll, but that probably should use a different helper then.)

I'm doing this blindly based on the definitions here https://github.com/llvm-mirror/clang/blob/master/include/clang/Basic/BuiltinsSystemZ.def
Are these accurate and complete?

Feb 18 2019, 9:04 AM
uweigand added a comment to D58353: SystemZ: Add ImmArg to intrinsics.

I'm not quite sure I understand the logic why some intrinsics that require immediate arguments are marked with ImmArg, but others are not?

Feb 18 2019, 8:37 AM

Feb 14 2019

uweigand accepted D58240: [SystemZ] Make sure VEXTEND and VROUND nodes are not emitted without vector support..

Ah, good catch! LGTM.

Feb 14 2019, 9:36 AM

Feb 13 2019

uweigand added a comment to D58142: [SystemZ] Accept more constant FP BuildVectors..

Hmm. Actually, I'm now wondering why we need to reject anything in the first place. Can't we improve isFPImmLegal to accept *anything* that can be constructed via any of the vector instructions (VGBM, VGM, VREPI)?
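
To make the question above concrete, here is a rough sketch of what "constructible via a vector instruction" could mean for a single 32-bit element. It is a simplified model and not the back-end's code: all function names are hypothetical, only one element size is considered, and the three predicates reflect my reading of the instructions (VGBM: every byte is 0x00 or 0xFF; VGM: one contiguous run of one bits, possibly wrapping around; VREPI: a sign-extended 16-bit immediate).

```cpp
#include <cstdint>
#include <cstring>

// VGBM-style pattern: every byte of the element is 0x00 or 0xFF.
bool isByteMask(std::uint32_t v) {
  for (int i = 0; i < 4; ++i) {
    std::uint32_t byte = (v >> (8 * i)) & 0xFFu;
    if (byte != 0x00u && byte != 0xFFu) return false;
  }
  return true;
}

// True if v is one contiguous run of one bits (no wraparound).
bool isShiftedMask(std::uint32_t v) {
  if (v == 0) return false;
  std::uint32_t filled = v | (v - 1);   // fill the zero bits below the run
  return (filled & (filled + 1)) == 0;  // must now look like 0...01...1
}

// VGM-style pattern: a run of ones that may wrap around the element boundary.
bool isBitRun(std::uint32_t v) {
  return v != 0 && (isShiftedMask(v) || isShiftedMask(~v));
}

// VREPI-style pattern: the value equals its low 16 bits sign-extended.
bool isSignExtended16(std::uint32_t v) {
  std::int32_t low =
      static_cast<std::int16_t>(static_cast<std::uint16_t>(v & 0xFFFFu));
  return static_cast<std::uint32_t>(low) == v;
}

// Hypothetical stand-in for a more permissive legality check on a float.
bool isVectorConstructible(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);  // reinterpret the float's bit pattern
  return isByteMask(bits) || isBitRun(bits) || isSignExtended16(bits);
}
```

Under this model `1.0f` (bit pattern 0x3F800000) qualifies as a contiguous bit run, while `0.1f` (0x3DCCCCCD) matches none of the three forms and would still need a constant-pool load.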

Feb 13 2019, 5:44 AM

Feb 12 2019

uweigand accepted D58003: [SystemZ] Use VGM whenever possible to load FP immediates.

Yes, I think this version is better.

Feb 12 2019, 4:41 AM

Feb 11 2019

uweigand added a comment to D58003: [SystemZ] Use VGM whenever possible to load FP immediates.

It seems you're assuming VGM is always available. I guess there needs to be a check for hasVector() somewhere.

Feb 11 2019, 4:58 AM

Feb 8 2019

uweigand added a comment to D57926: [SystemZ] Wait with selection of VREPI and VGM until after DAGCombine2..

I think 1a would be the best option, indeed.

Feb 8 2019, 8:55 AM
uweigand added a comment to D57926: [SystemZ] Wait with selection of VREPI and VGM until after DAGCombine2..

Well, for replication we definitely need proper float support. For VGBM, we could ignore floats since (except for the all-zero and all-one pattern) there aren't really any common FP constants that can be created via a VGBM pattern. But that isn't true at all for replication ...

Feb 8 2019, 5:14 AM

Feb 6 2019

uweigand abandoned D23467: Generate -1/0/1 memcmp/strcmp result for z13.
Feb 6 2019, 7:14 AM
uweigand commandeered D23467: Generate -1/0/1 memcmp/strcmp result for z13.

Now fixed in a slightly different manner as r353304.

Feb 6 2019, 7:14 AM
uweigand committed rG17a001268724: [SystemZ] Do not return INT_MIN from strcmp/memcmp (authored by uweigand).
[SystemZ] Do not return INT_MIN from strcmp/memcmp
Feb 6 2019, 7:10 AM
uweigand committed rL353304: [SystemZ] Do not return INT_MIN from strcmp/memcmp.
[SystemZ] Do not return INT_MIN from strcmp/memcmp
Feb 6 2019, 7:10 AM
uweigand accepted D57710: [SystemZ] Improve handling of @llvm.ctlz intrinsic.

LGTM, thanks!

Feb 6 2019, 2:27 AM
uweigand accepted D57152: [SystemZ] Wait with VGBM selection until after DAGCombine2..

Yes, I think this version makes most sense. LGTM.

Feb 6 2019, 2:27 AM

Jan 29 2019

uweigand accepted D57407: [DAG][SystemZ] Define unwrapAddress for PCREL_WRAPPER..

LGTM, thanks!

Jan 29 2019, 3:05 PM

Jan 28 2019

uweigand added a comment to D57152: [SystemZ] Wait with VGBM selection until after DAGCombine2..

The ExpandNode() case for ConstantFP will call TLI.isFPImmLegal(), which returns false (in most cases), and therefore ExpandFPConstant() is called, which returns a load from the constant pool.

Jan 28 2019, 4:59 AM

Jan 24 2019

uweigand added a comment to D57152: [SystemZ] Wait with VGBM selection until after DAGCombine2..

I am not quite sure if this is the best solution, but as it is now tryBuildVectorByteMask() is used first during legalization to build a new BUILD_VECTOR with the right constants, and then again in Select() to get the same mask back again. I first thought it would be possible to just leave the BUILD_VECTORS during legalization, but then I found a case where this doesn't work which involved ConstantFP<nan>, which ended up in the constant pool.

Jan 24 2019, 9:16 AM
uweigand added a comment to D56796: [DAGCombiner][x86] add transform/hook to vectorize: cast(extract V, Y).

Thanks for the heads-up! This may indeed be interesting for SystemZ, but I think it's still probably preferable to do it in the back-end like your alternative approach does; that will allow us to handle some special instruction selection issues we'll likely run into ...

Jan 24 2019, 9:03 AM

Jan 22 2019

uweigand accepted D57048: [SystemZ] Handle DBG_VALUE instructions in two places in backend.

LGTM, thanks!

Jan 22 2019, 7:24 AM
uweigand added a comment to D57048: [SystemZ] Handle DBG_VALUE instructions in two places in backend.

What's the reason for using SkipPHIsLabelsAndDebug instead of, say, skipDebugInstructionsForward? It's not obvious to me that skipping PHIs and labels is safe at all those places ...

Jan 22 2019, 6:09 AM

Dec 20 2018

uweigand committed rL349761: [SystemZ] "Generic" vector assembler instructions shoud clobber CC.
[SystemZ] "Generic" vector assembler instructions shoud clobber CC
Dec 20 2018, 6:28 AM
uweigand added a comment to D55916: [clang] Replace getOS() == llvm::Triple::*BSD with isOS*BSD() [NFCI].

This causes test case failures due to no longer linking with -lrt on Linux.

Dec 20 2018, 5:41 AM
uweigand added a comment to D55916: [clang] Replace getOS() == llvm::Triple::*BSD with isOS*BSD() [NFCI].

This causes test case failures due to no longer linking with -lrt on Linux.

Dec 20 2018, 5:38 AM
uweigand committed rC349753: [SystemZ] Improve testing of vecintrin.h intrinsics.
[SystemZ] Improve testing of vecintrin.h intrinsics
Dec 20 2018, 5:14 AM
uweigand committed rL349753: [SystemZ] Improve testing of vecintrin.h intrinsics.
[SystemZ] Improve testing of vecintrin.h intrinsics
Dec 20 2018, 5:14 AM
uweigand committed rC349751: [SystemZ] Fix wrong codegen caused by typos in vecintrin.h.
[SystemZ] Fix wrong codegen caused by typos in vecintrin.h
Dec 20 2018, 5:12 AM
uweigand committed rL349751: [SystemZ] Fix wrong codegen caused by typos in vecintrin.h.
[SystemZ] Fix wrong codegen caused by typos in vecintrin.h
Dec 20 2018, 5:12 AM
uweigand committed rL349749: [SystemZ] Make better use of VLLEZ.
[SystemZ] Make better use of VLLEZ
Dec 20 2018, 5:08 AM
uweigand committed rL349748: [SystemZ] Make better use of VGEF/VGEG.
[SystemZ] Make better use of VGEF/VGEG
Dec 20 2018, 5:04 AM
uweigand committed rL349746: [SystemZ] Make better use of VLDEB.
[SystemZ] Make better use of VLDEB
Dec 20 2018, 5:02 AM

Dec 18 2018

uweigand added a comment to D55722: [DAGCombiner] scalarize binop followed by extractelement.

I tried reverting my patch for the DAG type legalizer, and found that vec-trunc-to-i1.ll still fails with this new patch applied. It fails during type legalization, while this patch at least in this case gets enabled only after type legalization. So I think the test still serves its purpose.

Dec 18 2018, 2:30 AM
uweigand added inline comments to D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.
Dec 18 2018, 2:29 AM
uweigand added a comment to D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.

This looks like a promising direction. I particularly like the idea of having a way to intersect information from the backend instruction definitions with the constraints coming from the IR. However, I also have some concerns.

Dec 18 2018, 12:53 AM

Dec 17 2018

uweigand added a comment to D55722: [DAGCombiner] scalarize binop followed by extractelement.

Ah, right, I missed that.

Dec 17 2018, 9:15 AM
uweigand updated subscribers of D55722: [DAGCombiner] scalarize binop followed by extractelement.

Just looking at the SystemZ test cases:

Dec 17 2018, 4:11 AM

Dec 15 2018

uweigand added a comment to D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.

Well, mayRaise Exception is purely a MI level flag. I struggle to see where optimizations on the MI level would ever care about rounding modes in the sense you describe: note that currently, MI optimizations don't even know which operation an MI instruction performs -- if you don't even know whether you're dealing with addition or subtraction, why would you care which rounding mode the operation is performed in? MI transformations instead care about what I'd call "structural" properties of the operation: what are the operands, what is input vs. output, which memory may be accessed, which special registers may be involved, which other side effects may the operation have. This is the type of knowledge you need for the types of transformations that are done on the MI level: mostly about moving instructions around, arriving at an optimal schedule, de-duplicating identical operations performed multiple times etc. (Even things like simply changing a register operand to a memory operand for the same operation cannot be done solely by common MI optimizations but require per-target support.)

Dec 15 2018, 8:28 AM

Dec 14 2018

uweigand added a comment to D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.

This patch does seem FP exception centric and rounding mode agnostic though. Should FPExcept and friends be named something more general to cover both? To be clear, I'm okay with the current naming scheme, so just playing Devil's advocate.

Dec 14 2018, 11:53 AM
uweigand updated the diff for D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.

Updated comment.

Dec 14 2018, 11:50 AM
uweigand added a comment to D55600: [TargetLowering] Add ISD::OR + ISD::XOR handling to SimplifyDemandedVectorElts.

I leave the review of the SystemZ test to Uli.

Dec 14 2018, 4:12 AM

Dec 10 2018

uweigand created D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.
Dec 10 2018, 4:33 AM

Dec 4 2018

uweigand added a comment to D54649: [FPEnv] Rough out constrained FCmp intrinsics.

The one problem with your "new" code is that it now forces back-ends to implement something, or else code involving constrained intrinsics will trigger internal compiler errors. It might be preferable to avoid those ...

Dec 4 2018, 8:36 AM · Restricted Project
uweigand added inline comments to D54649: [FPEnv] Rough out constrained FCmp intrinsics.
Dec 4 2018, 7:36 AM · Restricted Project
uweigand added a comment to D19125: Enable __float128 on X86 and SystemZ.

GCC has never supported the __float128 type on SystemZ, because "long double" is already IEEE-128 on the platform. GCC only supports a separate __float128 type on platforms where "long double" is some other type (like x86 or ppc64).

Dec 4 2018, 2:59 AM
uweigand added a comment to D55057: [Headers] Make max_align_t match GCC's implementation..

As an aside, it would be nice if we had a test case that verifies the explicit values of alignof(max_align_t) on all supported platforms. This is an ABI property that should never change.
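
A test along the lines suggested above might look roughly like the following sketch. The portable part only checks guarantees the C++ standard already makes; the per-platform pin relies on `EXPECTED_MAX_ALIGN`, a hypothetical macro that a test driver would have to define for each target, since the concrete values are not given here.

```cpp
#include <cstddef>

// Portable lower bounds: max_align_t is at least as strictly aligned as
// every scalar type.
static_assert(alignof(std::max_align_t) >= alignof(long double),
              "max_align_t must cover long double");
static_assert(alignof(std::max_align_t) >= alignof(long long),
              "max_align_t must cover long long");

// Hypothetical per-target pin: EXPECTED_MAX_ALIGN is an assumed macro that
// each platform's test configuration would set to its ABI-mandated value.
#ifdef EXPECTED_MAX_ALIGN
static_assert(alignof(std::max_align_t) == EXPECTED_MAX_ALIGN,
              "ABI break: alignof(max_align_t) changed");
#endif

// Exposes the value so a runtime test can also report or compare it.
constexpr std::size_t maxAlign() { return alignof(std::max_align_t); }
```

Pinning the value with a compile-time assertion means an accidental ABI change fails the build for that target rather than surfacing later as a layout mismatch.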

Dec 4 2018, 2:58 AM
uweigand added inline comments to D55057: [Headers] Make max_align_t match GCC's implementation..
Dec 4 2018, 2:55 AM
uweigand committed rC348247: [SystemZ] Do not support __float128.
[SystemZ] Do not support __float128
Dec 4 2018, 2:54 AM
uweigand committed rL348247: [SystemZ] Do not support __float128.
[SystemZ] Do not support __float128
Dec 4 2018, 2:54 AM
uweigand added inline comments to D55057: [Headers] Make max_align_t match GCC's implementation..
Dec 4 2018, 2:48 AM

Dec 3 2018

uweigand added inline comments to D54649: [FPEnv] Rough out constrained FCmp intrinsics.
Dec 3 2018, 9:59 AM · Restricted Project
uweigand added inline comments to D55057: [Headers] Make max_align_t match GCC's implementation..
Dec 3 2018, 8:51 AM
uweigand added inline comments to D54649: [FPEnv] Rough out constrained FCmp intrinsics.
Dec 3 2018, 8:34 AM · Restricted Project
uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.

Digressing a bit, but has anyone given thought to how this implementation will play out with libraries? When running with traps enabled, libraries must be compiled for trap-safety. E.g. optimizations on a lib's code could introduce a NaN or cause a trap that does not exist in the code itself.

Dec 3 2018, 8:06 AM
uweigand added a comment to D54649: [FPEnv] Rough out constrained FCmp intrinsics.

@uweigand, what about something like this patch? The STRICT_FSETCC isn't handled all the way through the target, but it's stubbed out for now.

Dec 3 2018, 7:57 AM · Restricted Project
uweigand accepted D55111: [SystemZ::TTI] Return zero cost for icmp in case of Load-And-Test.

LGTM, thanks!

Dec 3 2018, 5:40 AM

Nov 29 2018

uweigand added inline comments to D55057: [Headers] Make max_align_t match GCC's implementation..
Nov 29 2018, 1:52 PM
uweigand accepted D55053: [SystemZ::TTI] i8/i16 operands extension cost revisited.

LGTM, thanks!

Nov 29 2018, 12:11 PM
uweigand added a comment to D50977: [TableGen] Examine entire subreg compositions to detect ambiguity.

LGTM!

Might wait a little before you land this to see if @uweigand got anything more to say. But afaict this only removes the warning without affecting the result in any way, so it should not make anything worse.

Nov 29 2018, 8:51 AM

Nov 28 2018

uweigand updated subscribers of D54962: [SystemZ] Rework subreg structure to avoid TableGen warning.

Thanks for pointing out those debug info differences, I agree that this might be a problem.

Nov 28 2018, 9:27 AM

Nov 27 2018

uweigand accepted D54944: [SystemZ::TTI] Improve cost for compare of i64 with extended i32 load.

LGTM, thanks!

Nov 27 2018, 11:19 AM
uweigand accepted D54940: [SystemZ::TTI] Improve costs for add, sub and mul i16 against memory.

LGTM, thanks!

Nov 27 2018, 11:18 AM
uweigand accepted D54897: [SystemZ::TTI] Improved cost values for comparison against memory..

LGTM, thanks!

Nov 27 2018, 11:15 AM
uweigand accepted D54870: [SystemZ::TTI] Return zero cost for a load / store connected with a scalar bswap.

LGTM, thanks!

Nov 27 2018, 11:14 AM
uweigand added a comment to D50977: [TableGen] Examine entire subreg compositions to detect ambiguity.

If so, maybe we can fix this by swapping around the low and high subregs of all other register definitions, so that then subreg_h32 *always* maps to subreg_h32(subreg_h64)), and instead of explicit subreg_hh32 and subreg_hl32 we have rather subreg_lh32 and subreg_ll32 ?

Nov 27 2018, 11:10 AM
uweigand created D54962: [SystemZ] Rework subreg structure to avoid TableGen warning.
Nov 27 2018, 11:07 AM
uweigand added a comment to D50977: [TableGen] Examine entire subreg compositions to detect ambiguity.

I must have missed the earlier discussion, but I agree with @bjope 's comment earlier that subreg_h32(V0) -> F0S is actually wrong; there should not be any subreg_h32(V0) at all!

Nov 27 2018, 7:32 AM

Nov 21 2018

uweigand added a comment to D52785: [PseudoSourceValue] New category to represent floating-point status.

How can we get to a decision on whether or not we can get to a place where we're no longer allowed to drop memoperands?

Nov 21 2018, 9:38 AM
uweigand accepted D54789: [SystemZ/TTI] Give correct cost values for vector bswap intrinsics.

LGTM, thanks!

Nov 21 2018, 6:32 AM
uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.
In D53157#1305233, @kpn wrote:

But given that there is still infrastructure missing in the IR optimizers, I also think that at least in the first implementation, we probably should go with the original approach and just use constrained intrinsics everywhere in the function, and possibly add some function attribute that prevents any cross-inlining of functions built with constrained intrinsics with functions built with regular floating-point operations.

Subtle. This last sentence seems to imply that cross-inlining should be allowed when there are no regular floating point operations in the function to be inlined. This makes sense due to, for example, the common use of tiny functions just to retrieve a value. Do I interpret your statement correctly?

Nov 21 2018, 6:16 AM

Nov 20 2018

uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.

I agree that it's preferable to re-use these existing options if possible. I have some concerns that -ftrapping-math has a partial implementation in place that doesn't seem to be well aligned with the way fast-math flags are handled, so it might require some work to have that working as expected without breaking existing users. In general though these seem like they should do what we need.

Nov 20 2018, 9:27 AM
uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.

OK, let me try to expand on my point 3 above, which appears to have confused everybody :-)

Nov 20 2018, 8:50 AM
uweigand added a comment to D54355: Use is.constant intrinsic for __builtin_constant_p.

It seems this patch caused the SystemZ build bots to fail, they're now all running into assertion failures:

Nov 20 2018, 6:02 AM

Nov 19 2018

uweigand added a comment to D53794: [TargetLowering] expandFP_TO_UINT - avoid FPE due to out of range conversion (PR17686).

As I understand the approach, it can never introduce a new trap. Now of course, if the original source value is outside the range of the target integer type, some of the instructions introduced by this approach may trap, but in that case, it is expected for the fp_to_uint operation to trap somewhere.
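
The general idea discussed above, building an unsigned conversion out of a signed one without ever converting an out-of-range intermediate, can be sketched as follows. This is an illustration under stated assumptions and not a copy of the patch: the function name is hypothetical, the element type is fixed to `double` and 64 bits, and the input is assumed to lie in [0, 2^64).

```cpp
#include <cstdint>

// Sketch: fp-to-unsigned conversion expressed through a signed conversion.
// Assumes 0 <= x < 2^64; inputs outside that range are out of scope here.
std::uint64_t fpToUint64ViaSigned(double x) {
  const double two63 = 9223372036854775808.0;  // 2^63, exactly representable
  const bool big = x >= two63;
  // Select the operand first, so the single conversion that actually runs
  // always sees a value that fits in the signed type.
  const double adjusted = big ? x - two63 : x;
  const std::int64_t s = static_cast<std::int64_t>(adjusted);
  // Put the top bit back for values that were shifted down by 2^63.
  const std::uint64_t signBit = big ? (std::uint64_t{1} << 63) : 0;
  return static_cast<std::uint64_t>(s) ^ signBit;
}
```

Because the select happens before the conversion, an in-range source value never feeds an out-of-range operand to the signed conversion, so no new exception is introduced for inputs that were valid to begin with.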

Nov 19 2018, 5:11 AM
uweigand added a comment to D54649: [FPEnv] Rough out constrained FCmp intrinsics.

In general, I think going forward the back-ends should be explicitly aware of the STRICT_ FP nodes. Anyway, that's the direction my proposed SystemZ patch takes: https://reviews.llvm.org/D45576

Nov 19 2018, 4:59 AM · Restricted Project
uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.

A couple of comments on the previous discussion:

Nov 19 2018, 4:08 AM

Nov 12 2018

uweigand accepted D54264: [SystemZ] Increase the number of VLREPs.

LGTM, thanks.

Nov 12 2018, 8:49 AM
uweigand added inline comments to D54264: [SystemZ] Increase the number of VLREPs.
Nov 12 2018, 8:03 AM
uweigand accepted D54423: [SystemZ::TTI] Improve accuracy of costs for vector fp <-> int conversions.

LGTM, thanks!

Nov 12 2018, 6:41 AM
uweigand added a comment to D54264: [SystemZ] Increase the number of VLREPs.

This seems reasonable in principle to me now, but see inline comments.

Nov 12 2018, 6:32 AM

Nov 9 2018

uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Re-added new tests accidentally lost in last diff.

Nov 9 2018, 12:07 PM
uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Updated to include support for recently added constrained intrinsics: floor, ceil, trunc, round, minnum, maxnum.

Nov 9 2018, 12:03 PM
uweigand committed rL346541: [SystemZ] Add a couple of missing tests.
[SystemZ] Add a couple of missing tests
Nov 9 2018, 11:18 AM
uweigand accepted D54322: [SystemZ] Replicate the load with most uses in buildVector().

LGTM, thanks!

Nov 9 2018, 9:38 AM
uweigand accepted D54315: [SystemZ] Avoid inserting same value after replication.

As a future enhancement, we might prefer to choose the initial value to replicate such that the number of VLVGx instructions is minimized. E.g. if we load two integers LA and LB from memory and construct the vector { LA, LB, LB, LB }, the current code, even with your change, would load and replicate LA, and then use three VLVGx to insert the LB copies. We could save two instructions by instead using load-and-replicate for LB.
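
The selection rule proposed above amounts to replicating the value with the most uses. A small sketch of that counting, with hypothetical names and plain integers standing in for the loaded values (nothing here is taken from buildVector() itself):

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical result type: which value to load-and-replicate, and how many
// element inserts are still needed afterwards.
struct ReplicationPlan {
  int value;
  std::size_t inserts;
};

// Picks the element value with the most occurrences, since every other
// element costs one insert after the replicate.
ReplicationPlan planReplication(const std::vector<int>& elems) {
  std::map<int, std::size_t> uses;
  for (int e : elems) ++uses[e];

  ReplicationPlan plan{0, elems.size()};
  for (const auto& entry : uses) {
    const std::size_t inserts = elems.size() - entry.second;
    if (inserts < plan.inserts) plan = {entry.first, inserts};
  }
  return plan;
}
```

With 10 standing in for LA and 20 for LB, `planReplication({10, 20, 20, 20})` picks 20 and leaves one insert, compared with three inserts when replicating LA, which matches the two-instruction saving described above.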

Nov 9 2018, 7:37 AM
uweigand added a comment to D54264: [SystemZ] Increase the number of VLREPs.

I'm not sure that this is always a win. In fact, this may depend on the type of the loaded value.

Nov 9 2018, 4:14 AM

Nov 7 2018

uweigand accepted D54197: [SystemZ] Bugfix in shouldCoalesce().

LGTM, thanks!

Nov 7 2018, 2:27 PM

Nov 2 2018

uweigand accepted D54028: [SystemZ::TTI] Let i8/i16 uint/sint to fp conversions cost 1 if operand is a load..

LGTM, thanks.

Nov 2 2018, 6:19 AM
uweigand accepted D53071: [SystemZ] Rework getInterleavedMemoryOpCost().

LGTM, thanks.

Nov 2 2018, 6:18 AM

Oct 31 2018

uweigand accepted D53923: [SystemZ] Accurate costs for i1->double vector conversions.

LGTM, thanks.

Oct 31 2018, 5:26 AM