This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/Target/PowerPC/
-
Target/
-
PowerPC/
9
PPC.td
-
test/CodeGen/PowerPC/
-
CodeGen/
-
PowerPC/
-
feature-isav30.ll

Differential D67981

[NFC][PowerPC] Adding FeatureFPU in the definition of FeatureISA3_0
AbandonedPublic

Authored by amyk on Sep 24 2019, 1:49 PM.

Download Raw Diff

Details

Reviewers

power-llvm-team
nemanjai
echristo
hfinkel

Group Reviewers

Restricted Project

Summary

This is patch aims to add FeatureFPU into the definition of the feature, FeatureISA3_0.

There is no need to specify both predicates, IsISA3_0 and HasFPU, as a compliant implementation of ISA 3.0 should have the floating point unit.

Diff Detail

Event Timeline

amyk created this revision.Sep 24 2019, 1:49 PM

Herald added a project: Restricted Project. · View Herald TranscriptSep 24 2019, 1:49 PM

Herald added subscribers: llvm-commits, shchenz, jsji and 3 others. · View Herald Transcript

jsji added a reviewer: Restricted Project.Sep 24 2019, 3:10 PM

Herald added a subscriber: • wuzish. · View Herald TranscriptSep 24 2019, 3:10 PM

jsji added inline comments.Sep 27 2019, 2:05 PM

llvm/lib/Target/PowerPC/PPC.td
193	Looks like there is some misunderstanding of what `FeatureISA3_0` is for? As in the description, `FeatureISA3_0` is for new instruction added in ISA 3.0, not all the instructions in ISA 3.0. Why it need to depend on FeatureFPU? In other words, with the new patch `-mattr=-fpu` will also remove `isa-v30-instructions` effectively. So that means, if you add -mattr=-fpu, you can't use any of the new instruction added for P8? Is that what you intended?

jsji added inline comments.Sep 27 2019, 2:57 PM

llvm/lib/Target/PowerPC/PPC.td
193	I meant `P9`.

nemanjai added inline comments.Sep 28 2019, 1:39 PM

llvm/lib/Target/PowerPC/PPC.td
193	Yes. That is very much part of the intent here. It is not possible to have a CPU that conforms to ISA 3.0 that does not support HW floating point operations (i.e. doesn't have a HW FPU). Having an FPU does not imply we have an implementation of ISA 3.0. But having an implementation of ISA 3.0 absolutely implies having an FPU.

jsji added inline comments.Sep 30 2019, 8:04 AM

llvm/lib/Target/PowerPC/PPC.td
193	Yes, you are right, and I also agree that `having an implementation of ISA 3.0 absolutely implies having an FPU`. However, looks like to me that `FeatureISA3_0` is NOT corresponding to `compliant implementation of ISA 3.0`. But rather controlling whether we can generate new instructions in ISA 3.0. Also, `FeatureFPU` is also not corresponding to `having an FPU`, but rather controlling whether we can generates FPU instructions. So if we add the Implies to `FPU` here, we are more or less saying: all ISA 3.0 instructions are also FPU instructions, we can only generate ISA 3.0 instructions if we are allow to generate FPU instructions. Does this dependency really exists? What if someone would like to avoid generating FPU instructions (eg: kernel users), but still would like to take advantage of P9 new instructions like `extswsli`, `maddld`,`modud`. I believe he/she can use `-mcpu=pwr9 -mattr=-fpu` right now, but will get performance deg if we apply this patch. What do you think?

nemanjai added inline comments.Sep 30 2019, 9:13 PM

llvm/lib/Target/PowerPC/PPC.td
193	If we are aware of a valid use case for something like this (i.e. we want instructions that were added in ISA 3.0) but we want a guarantee that no floating point instructions are emitted, we need to do a bit more work than just stop this patch from going in. The reason we added this feature is to differentiate SPE codegen from codegen for CPUs that have an FPU. That is about all it is good for. As an example, the following will produce FP instructions: define dso_local signext i32 @test(double %a, double %b) local_unnamed_addr #0 { entry: %cmp = fcmp olt double %a, %b %conv = zext i1 %cmp to i32 ret i32 %conv } llc -mattr=-fpu < a.ll \| grep fcmpu fcmpu 0, 1, 2 So the addition of this implication is really to avoid having to add the predicate for every pattern we want to add going forward.

jsji added inline comments.Oct 1 2019, 3:13 PM

llvm/lib/Target/PowerPC/PPC.td

193

If we are aware of a valid use case for something like this (i.e. we want instructions that were added in ISA 3.0) but we want a guarantee that no floating point instructions are emitted

I believe this is an example usage in Linux Kernel build. https://bugs.llvm.org/show_bug.cgi?id=38886, where kernel builder will use -msoft-float , which is corresponding to -mattr=-hard-float.

And with this patch we will definitely lose performance when they build with -mcpu=power9.

An example would be:

$ cat ex.ll
@z = external local_unnamed_addr global i32*, align 8

define signext i32 @_Z2tcii(i32 signext %x, i32 signext %y){
entry:
  %0 = load i32*, i32** @z, align 8
  %add = add nsw i32 %y, %x
  %idxprom = sext i32 %add to i64
  %arrayidx = getelementptr inbounds i32, i32* %0, i64 %idxprom
  %1 = load i32, i32* %arrayidx, align 4
  ret i32 %1
}

The current code gen:

$ ~/llvm-git/build/bin/llc -mcpu=pwr9 -mattr=-hard-float < ex.ll|grep localentry -A10
        .localentry     _Z2tcii, .Lfunc_lep0-.Lfunc_gep0
# %bb.0:                                # %entry
        addis 5, 2, .LC0@toc@ha
        ld 5, .LC0@toc@l(5)
        ld 5, 0(5)
        add 3, 4, 3
        extswsli 3, 3, 2
        lwax 3, 5, 3
        blr
        .long   0
        .quad   0

With this patch:

$ ~/llvm-git/build/bin/llc -mcpu=pwr9 -mattr=-hard-float < ex.ll|grep localentry -A10
        .localentry     _Z2tcii, .Lfunc_lep0-.Lfunc_gep0
# %bb.0:                                # %entry
        addis 5, 2, .LC0@toc@ha
        ld 5, .LC0@toc@l(5)
        add 3, 4, 3
        extsw 3, 3
        sldi 3, 3, 2
        ld 5, 0(5)
        lwax 3, 5, 3
        blr
        .long   0

193

As an example, the following will produce FP instructions:

Good example, this should be a bug, we should have a look and fix it later.

193

So the addition of this implication is really to avoid having to add the predicate for every pattern we want to add going forward.

Understand, can we achieve the goal by using a new predicate, like:

def ISA3_0_and_FPU: Predicate<"PPCSubTarget->hasFPU() && PPCSubTarget->isISA3_0()">;

I've realized I forgot to attach a full context patch.
Updated with full context.

nemanjai added inline comments.Oct 2 2019, 7:14 PM

llvm/lib/Target/PowerPC/PPC.td
193	I stand corrected. That is the use case I requested. Amy @amyk please abandon this patch for now as this has negative implications for kernel builds.

Thanks @jsji and @nemanjai for your insights on this patch. I will abandon this for now.

Revision Contents

Path

Size

llvm/

lib/

Target/

PowerPC/

PPC.td

3 lines

test/

CodeGen/

PowerPC/

feature-isav30.ll

18 lines

Diff 222729

llvm/lib/Target/PowerPC/PPC.td

Show First 20 Lines • Show All 183 Lines • ▼ Show 20 Lines	def FeatureSlowPOPCNTD : SubtargetFeature<"slow-popcntd","HasPOPCNTD",
"POPCNTD_Slow",		"POPCNTD_Slow",
"Has slow popcnt[dw] instructions">;		"Has slow popcnt[dw] instructions">;

def DeprecatedDST : SubtargetFeature<"", "DeprecatedDST", "true",		def DeprecatedDST : SubtargetFeature<"", "DeprecatedDST", "true",
"Treat vector data stream cache control instructions as deprecated">;		"Treat vector data stream cache control instructions as deprecated">;

def FeatureISA3_0 : SubtargetFeature<"isa-v30-instructions", "IsISA3_0",		def FeatureISA3_0 : SubtargetFeature<"isa-v30-instructions", "IsISA3_0",
"true",		"true",
"Enable instructions added in ISA 3.0.">;		"Enable instructions added in ISA 3.0.",
		[FeatureFPU]>;
		jsjiUnsubmitted Not Done Reply Inline Actions Looks like there is some misunderstanding of what `FeatureISA3_0` is for? As in the description, `FeatureISA3_0` is for new instruction added in ISA 3.0, not all the instructions in ISA 3.0. Why it need to depend on FeatureFPU? In other words, with the new patch `-mattr=-fpu` will also remove `isa-v30-instructions` effectively. So that means, if you add -mattr=-fpu, you can't use any of the new instruction added for P8? Is that what you intended? jsji: Looks like there is some misunderstanding of what `FeatureISA3_0` is for? As in the…
		jsjiUnsubmitted Not Done Reply Inline Actions I meant `P9`. jsji: I meant `P9`.
		nemanjaiUnsubmitted Not Done Reply Inline Actions Yes. That is very much part of the intent here. It is not possible to have a CPU that conforms to ISA 3.0 that does not support HW floating point operations (i.e. doesn't have a HW FPU). Having an FPU does not imply we have an implementation of ISA 3.0. But having an implementation of ISA 3.0 absolutely implies having an FPU. nemanjai: Yes. That is very much part of the intent here. It is not possible to have a CPU that conforms…
		jsjiUnsubmitted Not Done Reply Inline Actions Yes, you are right, and I also agree that `having an implementation of ISA 3.0 absolutely implies having an FPU`. However, looks like to me that `FeatureISA3_0` is NOT corresponding to `compliant implementation of ISA 3.0`. But rather controlling whether we can generate new instructions in ISA 3.0. Also, `FeatureFPU` is also not corresponding to `having an FPU`, but rather controlling whether we can generates FPU instructions. So if we add the Implies to `FPU` here, we are more or less saying: all ISA 3.0 instructions are also FPU instructions, we can only generate ISA 3.0 instructions if we are allow to generate FPU instructions. Does this dependency really exists? What if someone would like to avoid generating FPU instructions (eg: kernel users), but still would like to take advantage of P9 new instructions like `extswsli`, `maddld`,`modud`. I believe he/she can use `-mcpu=pwr9 -mattr=-fpu` right now, but will get performance deg if we apply this patch. What do you think? jsji: Yes, you are right, and I also agree that ` having an implementation of ISA 3.0 absolutely…
		nemanjaiUnsubmitted Not Done Reply Inline Actions If we are aware of a valid use case for something like this (i.e. we want instructions that were added in ISA 3.0) but we want a guarantee that no floating point instructions are emitted, we need to do a bit more work than just stop this patch from going in. The reason we added this feature is to differentiate SPE codegen from codegen for CPUs that have an FPU. That is about all it is good for. As an example, the following will produce FP instructions: define dso_local signext i32 @test(double %a, double %b) local_unnamed_addr #0 { entry: %cmp = fcmp olt double %a, %b %conv = zext i1 %cmp to i32 ret i32 %conv } llc -mattr=-fpu < a.ll \| grep fcmpu fcmpu 0, 1, 2 So the addition of this implication is really to avoid having to add the predicate for every pattern we want to add going forward. nemanjai: If we are aware of a valid use case for something like this (i.e. we want instructions that…
		jsjiUnsubmitted Not Done Reply Inline Actions If we are aware of a valid use case for something like this (i.e. we want instructions that were added in ISA 3.0) but we want a guarantee that no floating point instructions are emitted I believe this is an example usage in Linux Kernel build. https://bugs.llvm.org/show_bug.cgi?id=38886, where kernel builder will use `-msoft-float` , which is corresponding to `-mattr=-hard-float`. And with this patch we will definitely lose performance when they build with -mcpu=power9. An example would be: $ cat ex.ll @z = external local_unnamed_addr global i32, align 8 define signext i32 @_Z2tcii(i32 signext %x, i32 signext %y){ entry: %0 = load i32, i32** @z, align 8 %add = add nsw i32 %y, %x %idxprom = sext i32 %add to i64 %arrayidx = getelementptr inbounds i32, i32* %0, i64 %idxprom %1 = load i32, i32* %arrayidx, align 4 ret i32 %1 } The current code gen: $ ~/llvm-git/build/bin/llc -mcpu=pwr9 -mattr=-hard-float < ex.ll\|grep localentry -A10 .localentry _Z2tcii, .Lfunc_lep0-.Lfunc_gep0 # %bb.0: # %entry addis 5, 2, .LC0@toc@ha ld 5, .LC0@toc@l(5) ld 5, 0(5) add 3, 4, 3 extswsli 3, 3, 2 lwax 3, 5, 3 blr .long 0 .quad 0 With this patch: $ ~/llvm-git/build/bin/llc -mcpu=pwr9 -mattr=-hard-float < ex.ll\|grep localentry -A10 .localentry _Z2tcii, .Lfunc_lep0-.Lfunc_gep0 # %bb.0: # %entry addis 5, 2, .LC0@toc@ha ld 5, .LC0@toc@l(5) add 3, 4, 3 extsw 3, 3 sldi 3, 3, 2 ld 5, 0(5) lwax 3, 5, 3 blr .long 0 jsji: > If we are aware of a valid use case for something like this (i.e. we want instructions that…
		nemanjaiUnsubmitted Not Done Reply Inline Actions I stand corrected. That is the use case I requested. Amy @amyk please abandon this patch for now as this has negative implications for kernel builds. nemanjai: I stand corrected. That is the use case I requested. Amy @amyk please abandon this patch for…
		jsjiUnsubmitted Not Done Reply Inline Actions As an example, the following will produce FP instructions: Good example, this should be a bug, we should have a look and fix it later. jsji: > As an example, the following will produce FP instructions: Good example, this should be a…
		jsjiUnsubmitted Not Done Reply Inline Actions So the addition of this implication is really to avoid having to add the predicate for every pattern we want to add going forward. Understand, can we achieve the goal by using a new predicate, like: def ISA3_0_and_FPU: Predicate<"PPCSubTarget->hasFPU() && PPCSubTarget->isISA3_0()">; jsji: > So the addition of this implication is really to avoid having to add the predicate for every…
def FeatureP9Altivec : SubtargetFeature<"power9-altivec", "HasP9Altivec", "true",		def FeatureP9Altivec : SubtargetFeature<"power9-altivec", "HasP9Altivec", "true",
"Enable POWER9 Altivec instructions",		"Enable POWER9 Altivec instructions",
[FeatureISA3_0, FeatureP8Altivec]>;		[FeatureISA3_0, FeatureP8Altivec]>;
def FeatureP9Vector : SubtargetFeature<"power9-vector", "HasP9Vector", "true",		def FeatureP9Vector : SubtargetFeature<"power9-vector", "HasP9Vector", "true",
"Enable POWER9 vector instructions",		"Enable POWER9 vector instructions",
[FeatureISA3_0, FeatureP8Vector,		[FeatureISA3_0, FeatureP8Vector,
FeatureP9Altivec]>;		FeatureP9Altivec]>;
// A separate feature for this even though it is equivalent to P9Vector		// A separate feature for this even though it is equivalent to P9Vector
▲ Show 20 Lines • Show All 298 Lines • Show Last 20 Lines

llvm/test/CodeGen/PowerPC/feature-isav30.ll

This file was added.

				; RUN: not llc -mattr=+isa-v30-instructions,+spe -verify-machineinstrs \
				; RUN: -mtriple=powerpc-unknown-unknown %s -o - 2>&1 \| FileCheck %s

				; This IR aims to test how having ISA 3.0 implies that FPU is also enabled, as
				; a compliant implementation of ISA 3.0 should have the floating point unit.
				;
				; Thus, also setting SPE along side of having ISA 3.0 would not be correct as
				; both SPE and traditional floating point cannot be enabled.
				;
				; The triple is set to powerpc-unknown-unknown, as SPE is only supported on
				; 32-bit targets.

				define signext i32 @test_fpu_v30_feature () {
				; CHECK: LLVM ERROR: SPE and traditional floating point cannot both be enabled.
				entry:
				ret i32 0
				}

This is an archive of the discontinued LLVM Phabricator instance.

[NFC][PowerPC] Adding FeatureFPU in the definition of FeatureISA3_0 AbandonedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 222729

llvm/lib/Target/PowerPC/PPC.td

llvm/test/CodeGen/PowerPC/feature-isav30.ll

[NFC][PowerPC] Adding FeatureFPU in the definition of FeatureISA3_0
AbandonedPublic