This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] README: remove an implemented idea, add some new ones.
ClosedPublic

Authored by koriakin on Apr 11 2016, 6:07 AM.

Details

Commits

rGaa0476860009: [SystemZ] README: remove an implemented idea, add some new ones
rL265944: [SystemZ] README: remove an implemented idea, add some new ones

Summary

The note about conditional returns can now be removed, as they are
implemented. Let's also add 3 new ones in exchange.

Event Timeline

• koriakin updated this revision to Diff 53230. Apr 11 2016, 6:07 AM

• koriakin retitled this revision to [SystemZ] README: remove an implemented idea, add some new ones.

• koriakin updated this object.

• koriakin added a reviewer: uweigand.

• koriakin set the repository for this revision to rL LLVM.

• koriakin added a subscriber: llvm-commits.

Agreed on the trap instruction.

As to overflow, there is already this text in README.txt:

ADD LOGICAL WITH SIGNED IMMEDIATE could be useful when we need to
produce a carry. SUBTRACT LOGICAL IMMEDIATE could be useful when we
need to produce a borrow. (Note that there are no memory forms of
ADD LOGICAL WITH CARRY and SUBTRACT LOGICAL WITH BORROW, so the high
part of 128-bit memory operations would probably need to be done
via a register.)

Does this cover what you're referring to? In any case, this should probably be merged there.

As to SRDL etc., I think you're referring to 128-bit shifts? These are just one instance of a more general problem: the back-end currently does not handle i128 *at all*; it is marked as an illegal type. At some point, we should probably make i128 legal, and add optimal code gen for all the operations on that type, including shifts, but also the rest of them. (In particular, on z13 we should also use vector instructions e.g. for 128-bit add, subtract, shift.) This would also be a pre-req for implementing the 16-byte atomics that are mentioned a couple of lines earlier in README.txt.
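
To illustrate why carry handling matters for i128: a 128-bit add splits into two 64-bit adds, with the carry out of the low half feeding the high half (the role ADD LOGICAL and ADD LOGICAL WITH CARRY play on SystemZ). Below is a minimal C sketch of that decomposition. The `u128_parts` type and the `add128` and `add128_self_check` helpers are hypothetical names introduced here for illustration; they are not part of LLVM or of the patch under review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 128-bit integer held as two 64-bit halves. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_parts;

/* Add two 128-bit values half by half, propagating the carry. */
u128_parts add128(u128_parts a, u128_parts b) {
    u128_parts r;
    r.lo = a.lo + b.lo;            /* low half; wraps modulo 2^64 */
    uint64_t carry = r.lo < a.lo;  /* 1 if the low add wrapped around */
    r.hi = a.hi + b.hi + carry;    /* high half consumes the carry */
    return r;
}

/* Small self-check: (2^64 - 1) + 1 must carry into the high half. */
void add128_self_check(void) {
    u128_parts a = { 0, UINT64_MAX };
    u128_parts b = { 0, 1 };
    u128_parts r = add128(a, b);
    assert(r.hi == 1 && r.lo == 0);
}
```

A legal i128 type would let the back-end pick such a sequence (or a vector instruction on z13) directly, rather than relying on generic expansion.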

In D18962#396965, @uweigand wrote:

Agreed on the trap instruction.

As to overflow, there is already this text in README.txt:

ADD LOGICAL WITH SIGNED IMMEDIATE could be useful when we need to
produce a carry. SUBTRACT LOGICAL IMMEDIATE could be useful when we
need to produce a borrow. (Note that there are no memory forms of
ADD LOGICAL WITH CARRY and SUBTRACT LOGICAL WITH BORROW, so the high
part of 128-bit memory operations would probably need to be done
via a register.)

Does this cover what you're referring to? In any case, this should probably be merged there.

No, this is about carry, not overflow. By overflow I mean signed overflow, as used e.g. by -ftrapv:

int f(int a, int b) {
        return a + b;
}

Compiles with -ftrapv to:

define signext i32 @f(i32 signext %a, i32 signext %b) #0 {
entry:
  %0 = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
  %1 = extractvalue { i32, i1 } %0, 1
  br i1 %1, label %trap, label %cont

trap:                                             ; preds = %entry
  tail call void @llvm.trap() #2
  unreachable

cont:                                             ; preds = %entry
  %2 = extractvalue { i32, i1 } %0, 0
  ret i32 %2
}

Which compiles to this monstrosity:

f:                                      # @f
# BB#0:                                 # %entry
        stmg    %r14, %r15, 112(%r15)
        aghi    %r15, -160
        chi     %r3, 0
        ipm     %r0
        xilf    %r0, 4294967295
        risbg   %r0, %r0, 63, 191, 36
        chi     %r2, 0
        ipm     %r1
        xilf    %r1, 4294967295
        risbg   %r1, %r1, 63, 191, 36
        cr      %r1, %r0
        ipm     %r0
        afi     %r0, -268435456
        ar      %r2, %r3
        chi     %r2, 0
        ipm     %r3
        xilf    %r3, 4294967295
        risbg   %r3, %r3, 63, 191, 36
        cr      %r1, %r3
        ipm     %r1
        afi     %r1, 1879048192
        nr      %r1, %r0
        srl     %r1, 31
        cije    %r1, 1, .LBB0_2
# BB#1:                                 # %cont
        lgfr    %r2, %r2
        lmg     %r14, %r15, 272(%r15)
        br      %r14
.LBB0_2:                                # %trap
        brasl   %r14, abort@PLT
.Lfunc_end0:

While it could be like this:

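f:
    stmg    %r14, %r15, 112(%r15)   # save return address and stack pointer
    aghi    %r15, -160              # allocate the frame needed for the call to abort
    ar      %r2, %r3                # 32-bit add; sets CC 3 on signed overflow
    jo      .Lfail                  # branch if the add overflowed
    lgfr    %r2, %r2                # sign-extend the 32-bit result
    lmg     %r14, %r15, 272(%r15)   # restore (272 = 112 + 160)
    br      %r14
.Lfail:
    brasl   %r14, abort@PLT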

Combined with trap support, we could get that down to 4 instructions (ar, jo .+2, lgfr, br).
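
The carry/overflow distinction discussed above can be shown at the C level. The sketch below is illustrative only: the helper names `sadd_overflows`, `sadd_overflows_builtin` and `uadd_carries` are hypothetical, and it assumes a GCC- or Clang-compatible compiler for `__builtin_sadd_overflow` (which Clang typically lowers to llvm.sadd.with.overflow.i32, the intrinsic seen in the IR above).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true if a + b would overflow a signed int.
   Written without performing the overflowing add, which would be
   undefined behaviour in C. */
bool sadd_overflows(int a, int b) {
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return false;
}

/* Same check through the GCC/Clang builtin (assumed to be available).
   The wrapped sum is stored in *sum either way. */
bool sadd_overflows_builtin(int a, int b, int *sum) {
    return __builtin_sadd_overflow(a, b, sum);
}

/* Unsigned carry-out of a 32-bit add: a different condition entirely. */
bool uadd_carries(unsigned a, unsigned b) {
    return a + b < a;
}
```

For instance, INT_MAX + 1 overflows as a signed add but produces no unsigned carry, while (-1) + (-1) produces a carry but no signed overflow. The README entry about ADD LOGICAL covers only the second condition, which is why a separate entry for the 'overflow' condition makes sense.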

As to SRDL etc., I think you're referring to 128-bit shifts? These are just one instance of a more general problem: the back-end currently does not handle i128 *at all*; it is marked as an illegal type. At some point, we should probably make i128 legal, and add optimal code gen for all the operations on that type, including shifts, but also the rest of them. (In particular, on z13 we should also use vector instructions e.g. for 128-bit add, subtract, shift.) This would also be a pre-req for implementing the 16-byte atomics that are mentioned a couple of lines earlier in README.txt.

Oops, never mind... I've just seen that the instructions are actually 32-bit only (the "double" refers to 64-bit), so they are rather useless on a 64-bit target. I'll remove this entry.
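
As a rough illustration of why these instructions add little on a 64-bit target, here is a C model of what SRDL computes, as I understand it: the low 32 bits of an even/odd register pair are treated as one 64-bit value and shifted right logically. The function name `srdl_model` is hypothetical, and the masking of the shift amount to six bits is my reading of the architecture rather than something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model of SHIFT RIGHT DOUBLE LOGICAL: the 32-bit halves
   held in an even/odd register pair are shifted as one 64-bit value. */
void srdl_model(uint32_t *even, uint32_t *odd, unsigned amount) {
    uint64_t pair = ((uint64_t)*even << 32) | *odd;  /* even is the high half */
    pair >>= (amount & 63);                          /* assumed: 6-bit shift count */
    *even = (uint32_t)(pair >> 32);
    *odd = (uint32_t)pair;
}
```

On a 64-bit target the same result comes from a single 64-bit shift of one register, so the pair form only helps when the value already lives in two 32-bit halves.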

Ah OK, the overflow entry does indeed make sense then.

Removed the shifts idea, clarified the signed overflow one.

LGTM, I'll check it in.

This revision is now accepted and ready to land. Apr 11 2016, 7:42 AM

Closed by commit rL265944: [SystemZ] README: remove an implemented idea, add some new ones (authored by uweigand). Apr 11 2016, 7:44 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                             Size
lib/Target/SystemZ/README.txt    21 lines

Diff 53230

lib/Target/SystemZ/README.txt

(37 unchanged lines not shown)
 There is no scheduling support.

 --

 We don't use the BRANCH ON INDEX instructions.

 --

-We might want to use BRANCH ON CONDITION for conditional indirect calls
-and conditional returns.
-
---
-
 We don't use the TEST DATA CLASS instructions.

 --

 We only use MVC, XC and CLC for constant-length block operations.
 We could extend them to variable-length operations too,
 using EXECUTE RELATIVE LONG.

(102 unchanged lines not shown)
 --

 If needed, we can support 16-byte atomics using LPQ, STPQ and CSDG.

 --

 We might want to model all access registers and use them to spill
 32-bit values.
+
+--
+
+We might want to use 'j .+2' as a trap instruction, like gcc does. It can
+also be made conditional like the return instruction, allowing us to utilize
+compare-and-trap and load-and-trap instructions.
+
+--
+
+We might want to use the 'overflow' condition to support
+llvm.sadd.with.overflow.i32 and related instructions - the generated code
+is currently quite bad.
+
+--
+
+SRDL/SRDA/SLDL could be modeled and used for multi-precision shifts.