Page MenuHomePhabricator

[PowerPC] Exploit xxsplti32dx (constant materialization) for scalars
ClosedPublic

Authored by Conanap on Jan 26 2021, 10:35 AM.

Details

Summary

Previously related differential (exploit xxsplti32dx for vectors) here: https://reviews.llvm.org/D90173

This patch exploits the xxsplti32dx instruction available on Power10 in place of constant pool loads where xxspltidp would not be able to, usually because the immediate cannot fit into 32 bits.

Diff Detail

Unit TestsFailed

TimeTest
30 msx64 windows > LLVM.ExecutionEngine/JITLink/AArch64::MachO_arm64_ehframe.test
Script: -- : 'RUN: at line 1'; c:\ws\w64\llvm-project\premerge-checks\build\bin\llvm-jitlink.exe -noexec C:\ws\w64\llvm-project\premerge-checks\llvm\test\ExecutionEngine\JITLink\AArch64/Inputs/MachO_arm64_ehframe.o
50 msx64 windows > LLVM.ExecutionEngine/JITLink/AArch64::MachO_arm64_relocations.s
Script: -- : 'RUN: at line 1'; rm -rf C:\ws\w64\llvm-project\premerge-checks\build\test\ExecutionEngine\JITLink\AArch64\Output\MachO_arm64_relocations.s.tmp && mkdir -p C:\ws\w64\llvm-project\premerge-checks\build\test\ExecutionEngine\JITLink\AArch64\Output\MachO_arm64_relocations.s.tmp
60 msx64 windows > LLVM.ExecutionEngine/JITLink/X86::ELF_skip_debug_sections.s
Script: -- : 'RUN: at line 2'; c:\ws\w64\llvm-project\premerge-checks\build\bin\llvm-mc.exe -triple=x86_64-pc-linux-gnu -filetype=obj -o C:\ws\w64\llvm-project\premerge-checks\build\test\ExecutionEngine\JITLink\X86\Output\ELF_skip_debug_sections.s.tmp C:\ws\w64\llvm-project\premerge-checks\llvm\test\ExecutionEngine\JITLink\X86\ELF_skip_debug_sections.s
70 msx64 windows > LLVM.ExecutionEngine/JITLink/X86::ELF_weak_definitions.s
Script: -- : 'RUN: at line 1'; rm -rf C:\ws\w64\llvm-project\premerge-checks\build\test\ExecutionEngine\JITLink\X86\Output\ELF_weak_definitions.s.tmp && mkdir -p C:\ws\w64\llvm-project\premerge-checks\build\test\ExecutionEngine\JITLink\X86\Output\ELF_weak_definitions.s.tmp
60 msx64 windows > LLVM.ExecutionEngine/JITLink/X86::ELF_x86-64_common.s
Script: -- : 'RUN: at line 1'; rm -rf C:\ws\w64\llvm-project\premerge-checks\build\test\ExecutionEngine\JITLink\X86\Output\ELF_x86-64_common.s.tmp && mkdir -p C:\ws\w64\llvm-project\premerge-checks\build\test\ExecutionEngine\JITLink\X86\Output\ELF_x86-64_common.s.tmp
View Full Test Results (19 Failed)

Event Timeline

Conanap created this revision.Jan 26 2021, 10:35 AM
Conanap requested review of this revision.Jan 26 2021, 10:35 AM
Conanap updated this revision to Diff 321184.Feb 3 2021, 12:11 PM

Updated to ensure the shortcircuit protects against the destructive function.

stefanp requested changes to this revision.Feb 16 2021, 1:08 PM
stefanp added a subscriber: stefanp.
stefanp added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.h
1324

Is the APInt version of this function used anywere?

llvm/lib/Target/PowerPC/PPCInstrInfo.td
412

Why are we running this here?
We don't check the return of the function so we must assume that it returns true.
In that case the value of APFloatOfN won't change because convertToNonDenormSingle will only change the value of the parameter if it returns true. But checkNonDenormCannotConvertToSingle only returns true if convertToNonDenormSingle return false.

llvm/test/CodeGen/PowerPC/constant-pool.ll
48

I'm looking to understand this test case.
We are trying to materialize a special PowerPC long double (double-double). It seems that we have materialized one half of it and not the other half.

Is it because the first half is a denormal?
Why are we avoiding denormals anyway? It seems like we can completely specify a 64 bit double with two xxsplti32dx instructions.

369

What is going on here?
It almost looks like we are spilling vs3 half way through materializing a constant.

This revision now requires changes to proceed.Feb 16 2021, 1:08 PM
amyk added a subscriber: amyk.Feb 16 2021, 2:02 PM

In addition to the nit comments, I also have the same question as Stefan for getFPAs64BitIntHi/getFPAs64BitIntLo.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8590

nit: end with period.

llvm/lib/Target/PowerPC/PPCInstrPrefix.td
2616
2618

Have this : $A on the line above? Same as the one below.
Also a minor nit, but add a space to separate the (.

Conanap updated this revision to Diff 326000.Feb 24 2021, 12:09 AM
Conanap marked 6 inline comments as done.

Addressed some nits and a problem where sometimes the compiler would spill half way through materialization.

llvm/lib/Target/PowerPC/PPCISelLowering.h
1324

Hm I don't think so, although I implemented it for consistency with XXSPLTIDP (convertToNonDenormSingle). I'll remove this if that is preferred.

llvm/test/CodeGen/PowerPC/constant-pool.ll
369

This is fixed now, thank you for spotting this.

llvm/test/CodeGen/PowerPC/pcrel-linkeropt.ll
153–154

Just a heads up - the tests in this file are autogenerated, hence some unrelated changes.

Comments relate to just cleaning up the patch a little.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8592

I'm wondering if it would not be better to just inline this. It's just "not" of another call. That would simplify the patch a little.

15885

Isn't this just :

return convertToNonDenormSingle(APFloatOfImm) ||
       !convertToNonDenormSingle(APFloatOfImm);

Which is always true?

Basically the logic is that we can now materialize without a load any f32 or any f64.

llvm/lib/Target/PowerPC/PPCISelLowering.h
1324

nit:
Ok, unless other reviewers disagree, just remove it.

Conanap updated this revision to Diff 327537.Mar 2 2021, 12:11 PM

Addressed Stefan's comments, converted the check to a mirror of the original function for XXSPLTIDP except non-destructive.

Conanap marked 2 inline comments as done.Mar 2 2021, 12:13 PM
Conanap added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8592

so in the process of removing this, I thought I might as well just write a non-destructive test in its place. The tests didn't have any problems with just checkConvertToNonDenormSingle but might as well be on the safer side of things.

Conanap marked 3 inline comments as done.Mar 2 2021, 12:13 PM
amyk added inline comments.Mar 5 2021, 9:38 AM
llvm/lib/Target/PowerPC/PPCInstrPrefix.td
1885

I think it might be good to add a comment of why the XXSPLTI32DX instruction needs to be split out like this from the other instructions.

llvm/test/CodeGen/PowerPC/pcrel-linkeropt.ll
41

nit: Remove duplicate comment.

63

nit: Remove duplicate comment.

Conanap updated this revision to Diff 329015.Mar 8 2021, 7:47 AM
Conanap marked 3 inline comments as done.

Updated some comments.

amyk added inline comments.Mar 10 2021, 12:16 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
15884–15885

Minor nit on comment.

stefanp accepted this revision.Mar 12 2021, 4:24 AM

Thank you for adding this!
Other than one minor nit I think this LGTM.

Feel free to address nits on commits.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8593

nit:
You can just inline this. It is only used in one place.

This revision is now accepted and ready to land.Mar 12 2021, 4:24 AM