Page MenuHomePhabricator

uweigand (Ulrich Weigand)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 14 2013, 11:48 AM (296 w, 2 d)

Recent Activity

Yesterday

uweigand added a comment to D55722: [DAGCombiner] scalarize binop followed by extractelement.

I tried reverting my patch for the DAG type legalizer, and found that vec-trunc-to-i1.ll still fails with this new patch applied. It fails during type legalization, while this patch at least in this case gets enabled only after type legalization. So I think the test still serves its purpose.

Tue, Dec 18, 2:30 AM
uweigand added inline comments to D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.
Tue, Dec 18, 2:29 AM
uweigand added a comment to D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.

This looks like a promising direction. I particularly like the idea of having a way to intersect information from the backend instruction definitions with the constraints coming from the IR. However, I also have some concerns.

Tue, Dec 18, 12:53 AM

Mon, Dec 17

uweigand added a comment to D55722: [DAGCombiner] scalarize binop followed by extractelement.

Ah, right, I missed that.

Mon, Dec 17, 9:15 AM
uweigand updated subscribers of D55722: [DAGCombiner] scalarize binop followed by extractelement.

Just looking at the SystemZ test cases:

Mon, Dec 17, 4:11 AM

Sat, Dec 15

uweigand added a comment to D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.

Well, mayRaise Exception is purely a MI level flag. I struggle to see where optimizations on the MI level would ever care about rounding modes in the sense you describe: note that currently, MI optimizations don't even know which operation an MI instruction performs -- if you don't even know whether you're dealing with addition or subtraction, why would you care which rounding mode the operation is performed in? MI transformations instead care about what I'd call "structural" properties of the operation: what are the operands, what is input vs. output, which memory may be accessed, which special registers may involved, which other side effects may the operation have. This is the type of knowledge you need for the types of transformations that are done on the MI level: mostly about moving instructions around, arriving at an optimal schedule, de-duplicating identical operations performed multiple times etc. (Even things like simply changing a register operand to a memory operand for the same operation cannot be done solely by common MI optimizations but require per-target support.)

Sat, Dec 15, 8:28 AM

Fri, Dec 14

uweigand added a comment to D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.

This patch does seem FP exception centric and rounding mode agnostic though. Should FPExcept and friends be named something more general to cover both? To be clear, I'm okay with the current naming scheme, so just playing Devil's advocate.

Fri, Dec 14, 11:53 AM
uweigand updated the diff for D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.

Updated comment.

Fri, Dec 14, 11:50 AM
uweigand added a comment to D55600: [TargetLowering] Add ISD::OR + ISD::XOR handling to SimplifyDemandedVectorElts.

I leave the review of the SystemZ test to Uli.

Fri, Dec 14, 4:12 AM

Mon, Dec 10

uweigand created D55506: [RFC v2] Allow target to handle STRICT floating-point nodes.
Mon, Dec 10, 4:33 AM

Tue, Dec 4

uweigand added a comment to D54649: [FPEnv] Rough out constrained FCmp intrinsics.

The one problem with your "new" code is that it now forces back-ends to implement something, or else code involving constrained intrinsics will trigger internal compiler errors. It might be preferable to avoid those ...

Tue, Dec 4, 8:36 AM
uweigand added inline comments to D54649: [FPEnv] Rough out constrained FCmp intrinsics.
Tue, Dec 4, 7:36 AM
uweigand added a comment to D19125: Enable __float128 on X86 and SystemZ.

GCC has never supported the __float128 type on SystemZ, because "long double" is already IEEE-128 on the platform. GCC only supports a separate __float128 type on platforms where "long double" is some other type (like x86 or ppc64).

Tue, Dec 4, 2:59 AM
uweigand added a comment to D55057: [Headers] Make max_align_t match GCC's implementation..

As an aside, it would be nice if we had a test case that verifies the explicit values of alignof(max_align_t) on all supported platforms. This is an ABI property that should never change.

Tue, Dec 4, 2:58 AM
uweigand added inline comments to D55057: [Headers] Make max_align_t match GCC's implementation..
Tue, Dec 4, 2:55 AM
uweigand committed rC348247: [SystemZ] Do not support __float128.
[SystemZ] Do not support __float128
Tue, Dec 4, 2:54 AM
uweigand committed rL348247: [SystemZ] Do not support __float128.
[SystemZ] Do not support __float128
Tue, Dec 4, 2:54 AM
uweigand added inline comments to D55057: [Headers] Make max_align_t match GCC's implementation..
Tue, Dec 4, 2:48 AM

Mon, Dec 3

uweigand added inline comments to D54649: [FPEnv] Rough out constrained FCmp intrinsics.
Mon, Dec 3, 9:59 AM
uweigand added inline comments to D55057: [Headers] Make max_align_t match GCC's implementation..
Mon, Dec 3, 8:51 AM
uweigand added inline comments to D54649: [FPEnv] Rough out constrained FCmp intrinsics.
Mon, Dec 3, 8:34 AM
uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.

Digressing a bit, but has anyone given thought to how this implementation will play out with libraries? When running with traps enabled, libraries must be compiled for trap-safety. E.g. optimizations on a lib's code could introduce a NaN or cause a trap that does not exist in the code itself.

Mon, Dec 3, 8:06 AM
uweigand added a comment to D54649: [FPEnv] Rough out constrained FCmp intrinsics.

@uweigand, what about something like this patch? The STRICT_FSETCC isn't handled all the way through the target, but it's stubbed out for now.

Mon, Dec 3, 7:57 AM
uweigand accepted D55111: [SystemZ::TTI] Return zero cost for icmp in case of Load-And-Test.

LGTM, thanks!

Mon, Dec 3, 5:40 AM

Thu, Nov 29

uweigand added inline comments to D55057: [Headers] Make max_align_t match GCC's implementation..
Thu, Nov 29, 1:52 PM
uweigand accepted D55053: [SystemZ::TTI] i8/i16 operands extension cost revisited.

LGTM, thanks!

Thu, Nov 29, 12:11 PM
uweigand added a comment to D50977: [TableGen] Examine entire subreg compositions to detect ambiguity.

LGTM!

Might wait a little before you land this to see if @uweigand got anything more to say. But afaict this only removes the warning without affecting the result in any way, so it should not make anything worse.

Thu, Nov 29, 8:51 AM

Wed, Nov 28

uweigand updated subscribers of D54962: [SystemZ] Rework subreg structure to avoid TableGen warning.

Thanks for pointing out those debug info differences, I agree that this might be a problem.

Wed, Nov 28, 9:27 AM

Tue, Nov 27

uweigand accepted D54944: [SystemZ::TTI] Improve cost for compare of i64 with extended i32 load.

LGTM, thanks!

Tue, Nov 27, 11:19 AM
uweigand accepted D54940: [SystemZ::TTI] Improve costs for add, sub and mul i16 against memory.

LGTM, thanks!

Tue, Nov 27, 11:18 AM
uweigand accepted D54897: [SystemZ::TTI] Improved cost values for comparison against memory..

LGTM, thanks!

Tue, Nov 27, 11:15 AM
uweigand accepted D54870: [SystemZ::TTI] Return zero cost for a load / store connected with a scalar bswap.

LGTM, thanks!

Tue, Nov 27, 11:14 AM
uweigand added a comment to D50977: [TableGen] Examine entire subreg compositions to detect ambiguity.

If so, maybe we can fix this by swapping around the low and high subregs of all other register definitions, so that then subreg_h32 *always* maps to subreg_h32(subreg_h64)), and instead of explicit subreg_hh32 and subreg_hl32 we have rather subreg_lh32 and subreg_ll32 ?

Tue, Nov 27, 11:10 AM
uweigand created D54962: [SystemZ] Rework subreg structure to avoid TableGen warning.
Tue, Nov 27, 11:07 AM
uweigand added a comment to D50977: [TableGen] Examine entire subreg compositions to detect ambiguity.

I must have missed the earlier discussion, but I agree with @bjope 's comment earlier that subreg_h32(V0) -> F0S is actually wrong; there should not be any subreg_h32(V0) at all!

Tue, Nov 27, 7:32 AM

Wed, Nov 21

uweigand added a comment to D52785: [PseudoSourceValue] New category to represent floating-point status.

How can we get to a decision on whether or not we can get to a place where we're no longer allowed to drop memoperands?

Wed, Nov 21, 9:38 AM
uweigand accepted D54789: [SystemZ/TTI] Give correct cost values for vector bswap intrinsics.

LGTM, thanks!

Wed, Nov 21, 6:32 AM
uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.
In D53157#1305233, @kpn wrote:

But given that there is still infrastructure missing in the IR optimizers, I also think that at least in the first implementation, we probably should go with the original approach and just use constrained intrinsics everywhere in the function, and possibly add some function attribute that prevent any cross-inlining of functions built with constrained intrinsics with functions built with regular floating-point operations.

Subtle. This last sentence seems to imply that cross-inlining should be allowed when there are no regular floating point operations in the function to be inlined. This makes sense due to, for example, the common use of tiny functions just to retrieve a value. Do I interpret your statement correctly?

Wed, Nov 21, 6:16 AM

Tue, Nov 20

uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.

I agree that it's preferable to re-use these existing options if possible. I have some concerns that -ftrapping-math has a partial implementation in place that doesn't seem to be well aligned with the way fast-math flags are handled, so it might require some work to have that working as expected without breaking existing users. In general though these seem like they should do what we need.

Tue, Nov 20, 9:27 AM
uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.

OK, let me try to expand on my point 3 above, which appears to have confused everybody :-)

Tue, Nov 20, 8:50 AM
uweigand added a comment to D54355: Use is.constant intrinsic for __builtin_constant_p.

It seems this patch caused the SystemZ build bots to fail, they're now all running into assertion failures:

Tue, Nov 20, 6:02 AM

Mon, Nov 19

uweigand added a comment to D53794: [TargetLowering] expandFP_TO_UINT - avoid FPE due to out of range conversion (PR17686).

As I understand the approach, it can never introduce a new trap. Now of course, if the original source value is outside the range of the target integer type, some of the instructions introduced by this approach may trap, but in that case, it is expected for the fp_to_uint operation to trap somewhere.

Mon, Nov 19, 5:11 AM
uweigand added a comment to D54649: [FPEnv] Rough out constrained FCmp intrinsics.

In general, I think going forward the back-ends should be explicitly aware of the STRICT_ FP nodes. Anyway, that's the direction my proposed SystemZ patch takes: https://reviews.llvm.org/D45576

Mon, Nov 19, 4:59 AM
uweigand added a comment to D53157: Teach the IRBuilder about constrained fadd and friends.

A couple of comments on the previous discussion:

Mon, Nov 19, 4:08 AM

Nov 12 2018

uweigand accepted D54264: [SystemZ] Increase the number of VLREPs.

LGTM, thanks.

Nov 12 2018, 8:49 AM
uweigand added inline comments to D54264: [SystemZ] Increase the number of VLREPs.
Nov 12 2018, 8:03 AM
uweigand accepted D54423: [SystemZ::TTI] Improve accuracy of costs for vector fp <-> int conversions.

LGTM, thanks!

Nov 12 2018, 6:41 AM
uweigand added a comment to D54264: [SystemZ] Increase the number of VLREPs.

This seems reasonable in principle to me now, but see inline comments.

Nov 12 2018, 6:32 AM

Nov 9 2018

uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Re-added new tests accidentally lost in last diff.

Nov 9 2018, 12:07 PM
uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Updated to include support for recently added constrained intrinsics: floor, ceil, trunc, round, minnum, maxnum.

Nov 9 2018, 12:03 PM
uweigand committed rL346541: [SystemZ] Add a couple of missing tests.
[SystemZ] Add a couple of missing tests
Nov 9 2018, 11:18 AM
uweigand accepted D54322: [SystemZ] Replicate the load with most uses in buildVector().

LGTM, thanks!

Nov 9 2018, 9:38 AM
uweigand accepted D54315: [SystemZ] Avoid inserting same value after replication.

As a future enhancement, we might prefer to choose the initial value to replicate such that the number of VLVGx instruction is minimized. E.g. if we load two integers LA and LB from memory and construct the vector { LA, LB, LB, LB }, the current code, even with your change, would load and replicate LA, and then use three VLVGx to insert the LB copies. We could save two instructions by instead using load-and-replicate for LB.

Nov 9 2018, 7:37 AM
uweigand added a comment to D54264: [SystemZ] Increase the number of VLREPs.

I'm not sure that this is always a win. In fact, this may depend on the type of the loaded value.

Nov 9 2018, 4:14 AM

Nov 7 2018

uweigand accepted D54197: [SystemZ] Bugfix in shouldCoalesce().

LGTM, thanks!

Nov 7 2018, 2:27 PM

Nov 2 2018

uweigand accepted D54028: [SystemZ::TTI] Let i8/i16 uint/sint to fp conversions cost 1 if operand is a load..

LGTM, thanks.

Nov 2 2018, 6:19 AM
uweigand accepted D53071: [SystemZ] Rework getInterleavedMemoryOpCost().

LGTM, thanks.

Nov 2 2018, 6:18 AM

Oct 31 2018

uweigand accepted D53923: [SystemZ] Accurate costs for i1->double vector conversions.

LGTM, thanks.

Oct 31 2018, 5:26 AM
uweigand accepted D53924: [SystemZ] Recognize the higher cost of scalar i1 -> fp conversion.

LGTM, thanks.

Oct 31 2018, 5:26 AM

Oct 30 2018

uweigand committed rL345618: [SystemZ] Simplify LRV/STRV ISD nodes.
[SystemZ] Simplify LRV/STRV ISD nodes
Oct 30 2018, 11:24 AM
uweigand added inline comments to D52417: [LoopVectorizer] Don't pass the instruction pointer from getMemInstScalarizationCost..
Oct 30 2018, 6:31 AM

Oct 29 2018

uweigand accepted D53791: [SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv..

LGTM, thanks!

Oct 29 2018, 10:00 AM
uweigand updated subscribers of D53794: [TargetLowering] expandFP_TO_UINT - avoid FPE due to out of range conversion (PR17686).

So I had thought that normal LLVM floating-point IR operations are defined to assume that FP exceptions are disabled, and therefore transformations do not need to preserve the FP exception status.

Oct 29 2018, 7:42 AM

Oct 19 2018

uweigand added a comment to D53411: [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics.
In D53411#1269133, @kpn wrote:

While you're at it, shouldn't we also add ROUND and TRUNC to complete the set?

Those are in D43515 which I'm still working on.

Oct 19 2018, 6:07 AM
uweigand added a comment to D53411: [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics.

While you're at it, shouldn't we also add ROUND and TRUNC to complete the set?

Oct 19 2018, 4:48 AM
uweigand added a comment to D53328: [SystemZ] Implement SystemZOperand::print().

See inline comment. Otherwise LGTM.

Oct 19 2018, 3:07 AM

Oct 18 2018

uweigand added a comment to D53373: [CodeMetrics] Don't let extends of i1 be free..

It seems this is a deliberate choice made here:

Oct 18 2018, 8:54 AM

Oct 17 2018

uweigand added a comment to D53333: [SystemZ] Make sure that SystemZAddressingMode::dump() does not crash..

The SystemZ parts LGTM.

Oct 17 2018, 6:38 AM
uweigand added a comment to D53328: [SystemZ] Implement SystemZOperand::print().

I tried to print register names but did not find a trivial way to get the MachineRegisterInfo. Now register integer values are printed instead.

Oct 17 2018, 6:35 AM
uweigand accepted D52692: [SystemZ] Experimental: Improve getMemoryOpCost() to find foldable loads that are converted..

LGTM, thanks!

Oct 17 2018, 6:26 AM
uweigand added a comment to D53196: [SystemZ] Improve handling and cost estimates of vector integer div/rem.

LGTM, thanks!

Oct 17 2018, 6:22 AM

Oct 16 2018

uweigand committed rL344611: [SystemZ] Actually enable -mzvector keywords.
[SystemZ] Actually enable -mzvector keywords
Oct 16 2018, 7:59 AM
uweigand committed rC344611: [SystemZ] Actually enable -mzvector keywords.
[SystemZ] Actually enable -mzvector keywords
Oct 16 2018, 7:59 AM

Oct 15 2018

uweigand added a comment to D52840: Include Python binding tests in CMake rules.

The first one seems to indicate that your libclang.so is broken in release mode (optimization error?). The second one is correct (some of the tests test for errors, and apparently don't silence the messages).

Oct 15 2018, 11:39 AM
uweigand added a comment to D52840: Include Python binding tests in CMake rules.

This causes check-all to abort for me on SystemZ in Release mode (and never actually run the lit tests):

Oct 15 2018, 10:22 AM

Oct 10 2018

uweigand added a comment to D53071: [SystemZ] Rework getInterleavedMemoryOpCost().

It would be good to have some tests for this ...

Oct 10 2018, 11:01 AM

Oct 9 2018

uweigand added a comment to D52692: [SystemZ] Experimental: Improve getMemoryOpCost() to find foldable loads that are converted..

This does look like a nice refactoring, so it might be worthwhile getting it in anyway, even if there's not much change in codegen ...

Oct 9 2018, 10:43 AM
uweigand added a comment to D52685: [LoopVectorizer] Adjust heuristics for a truncated load.

I agree it does make sense to me to consider VF 4 in this test case, and the resulting code is better on Z.

Oct 9 2018, 10:40 AM

Oct 4 2018

uweigand added a comment to D52785: [PseudoSourceValue] New category to represent floating-point status.

OK, I understand the concern better now, thanks.

Oct 4 2018, 9:33 AM
uweigand added a comment to D52785: [PseudoSourceValue] New category to represent floating-point status.

How is this going to work? I don't think this can work unless we can start guaranteeing that MMOs are never dropped. Most FP instructions aren't considered as mayLoad or mayStore, and will probably easily lose the MMO if it's the only thing indicating they care about the FP mode

Oct 4 2018, 4:47 AM

Oct 2 2018

uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Yet another Ping after returning from vacation ...

Oct 2 2018, 10:01 AM
uweigand created D52786: [MI] New flag mayAccessMemory.
Oct 2 2018, 9:56 AM
uweigand created D52785: [PseudoSourceValue] New category to represent floating-point status.
Oct 2 2018, 9:55 AM
uweigand added a comment to D52764: [Intrinsic] Add llvm.minimum and llvm.maximum instrinsic functions.

I agree that this seems fine for SystemZ.

Oct 2 2018, 7:52 AM

Sep 27 2018

uweigand added a comment to D52351: [LoopVectorizer] Fix in getScalarizationOverhead().

LGTM as well.

Sep 27 2018, 6:50 AM

Sep 14 2018

uweigand committed rCRT342236: [asan] Fix test case failure on SystemZ.
[asan] Fix test case failure on SystemZ
Sep 14 2018, 6:38 AM
uweigand committed rL342236: [asan] Fix test case failure on SystemZ.
[asan] Fix test case failure on SystemZ
Sep 14 2018, 6:38 AM

Sep 12 2018

uweigand accepted D51339: [SystemZ] Adjust cost functions for targets that use LI + LOC instead of IPM.

I agree, we need to update the cost model. Patch LGTM.

Sep 12 2018, 6:29 AM

Aug 18 2018

uweigand added a comment to D50913: [FPEnv] Don't need copysign/fabs/fneg constrained intrinsics.

I do see that you wrote -0-x. Is there something special about the -0?

Aug 18 2018, 7:09 PM

Aug 17 2018

uweigand added a comment to D50913: [FPEnv] Don't need copysign/fabs/fneg constrained intrinsics.

I believe we still don't need an intrinsic for constrained fneg. As you say, floating-point negation is implemented as -0-x, using a regular fsub (*not* the constrained intrinsic). The fsub semantics says that it cannot trap, so this would still be fine -- as long as we're sure the -0-x is always implemented via a dedicated negate instruction, and never via a subtraction. But this seems to be true, the idiom -0-x is recognized early during codegen and always treated specially.

Aug 17 2018, 2:02 PM

Aug 15 2018

uweigand accepted D50779: [SystemZ] New CL option to enable subreg liveness.

LGTM.

Aug 15 2018, 7:59 AM
uweigand accepted D50725: [SystemZ] Replace subreg_r with subreg_h.

Ah, that's good to know! So if I understand this correctly, accessing even the 32-bit part (F0S) would be considered to clobber F0D. This is not really necessary, but probably doesn't hurt at this point. We could do the same thing as the HAX you mention to get this modeled exactly.

Aug 15 2018, 7:58 AM
uweigand added a comment to D50725: [SystemZ] Replace subreg_r with subreg_h.

Well, the original rationale for using different subreg indices for float/vector registers is given here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150504/274358.html

Aug 15 2018, 6:27 AM

Aug 10 2018

uweigand added a comment to D50546: CMake: Fix native arch selection on s390 (32-bit).

Why? The SystemZ back-end doesn't support 32-bit code generation anyway, so there is no native LLVM support on s390 ...

Aug 10 2018, 2:05 AM

Aug 9 2018

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Ping. Would this be easier to review if I split it up into multiple patches?

Aug 9 2018, 10:19 AM
uweigand added a comment to D50514: [7.0 branch] Update release notes (SystemZ, TableGen).

I've checked it in now, thanks!

Aug 9 2018, 9:19 AM
uweigand committed rL339355: [7.0 branch] Update release notes (SystemZ, TableGen).
[7.0 branch] Update release notes (SystemZ, TableGen)
Aug 9 2018, 9:18 AM
uweigand closed D50514: [7.0 branch] Update release notes (SystemZ, TableGen).
Aug 9 2018, 9:18 AM
uweigand created D50514: [7.0 branch] Update release notes (SystemZ, TableGen).
Aug 9 2018, 8:27 AM

Aug 7 2018

uweigand accepted D50358: [SelectionDAG][X86][SystemZ] Add a generic nonvolatile_store/nonvolatile_load pattern fragment in TargetSelectionDAG.td.

LGTM as well.

Aug 7 2018, 4:27 AM