This is an archive of the discontinued LLVM Phabricator instance.

[X86] [AMX] Replace bitcast with specific AMX intrinsics with X86 specific cast.
ClosedPublic

Authored by yubing on Aug 5 2021, 2:30 AM.

Details

Summary

There is some discussion on the bitcast for vector and x86_amx at https://reviews.llvm.org/D99152. This patch is to introduce a x86 specific cast for vector and x86_amx, so that it can avoid some unnecessary optimization by middle-end. On the other way, we have to optimize the x86 specific cast by ourselves. This patch also optimize the cast operation to eliminate redundant code.

Diff Detail

Event Timeline

yubing created this revision.Aug 5 2021, 2:30 AM
yubing requested review of this revision.Aug 5 2021, 2:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2021, 2:30 AM

Would you summarize the the title in fewer words and elaborate it in description and commit message?

LuoYuanke retitled this revision from [X86] [AMX] Replace bitcast with specific AMX intrinsics In this patch, we introduce new intrinsics for cast from vector to x86_amx and cast from x86_amx to vector. to [X86] [AMX] Replace bitcast with specific AMX intrinsics with X86 specific cast..Aug 7 2021, 2:41 AM
LuoYuanke edited the summary of this revision. (Show Details)
fhahn added a comment.Aug 13 2021, 1:44 AM

Could you also document those intrinsics in the LangRef?

llvm/lib/Target/X86/X86LowerAMXType.cpp
71

can you just return match() || match()?

Also it looks like you are not using Vec, so you could use m_Any() or something like that instead.

Could you also document those intrinsics in the LangRef?

This is X86 specific intrinsics. Should we document it in LangRef?

LuoYuanke added inline comments.Aug 16 2021, 7:50 AM
llvm/lib/Target/X86/X86LowerAMXType.cpp
800

Drop brace.

820

Drop brace.

822

Drop brace.

837

of

841

What does ACI stand for? AMX cast intrinsic?

857

Drop brace.

917

We can erase dead cast code from Vec2TileInsts and Vec2TileInsts and get AMX cast instruction from there, so that we can avoid iterate basic block again.

yubing added inline comments.Aug 16 2021, 7:57 AM
llvm/lib/Target/X86/X86LowerAMXType.cpp
71

Thanks for comments. You're right, I will change it.
We could use m_Value().

LuoYuanke added inline comments.Aug 16 2021, 11:19 PM
llvm/test/CodeGen/X86/AMX/lat-combine-amx-bitcast.ll
107

We can optimize udef or zero vector with tilezero. We may do it in another patch.

153

This may be optimized to "phi <x86_amx>", and cast back to vector for its user %evilphi2. But we can optimize it in another patch.

400

Could you simplify the name?

llvm/test/CodeGen/X86/AMX/lat-transform-amx-bitcast.ll
142

We can forward llvm.x86.tilestored64.internal to finial store (line 143). We can optimize it in another patch.

292

We can combine load and cast to @llvm.x86.tileloadd64.internal.

yubing updated this revision to Diff 366816.Aug 17 2021, 12:07 AM

Address flanhn and yuanke's comments

yubing marked 6 inline comments as done.Aug 17 2021, 12:07 AM
yubing added inline comments.
llvm/lib/Target/X86/X86LowerAMXType.cpp
841

Yeah, AMX cast intrinsic

917

We will refactor it in next patch

llvm/test/CodeGen/X86/AMX/lat-combine-amx-bitcast.ll
107

Sure

llvm/test/CodeGen/X86/AMX/lat-transform-amx-bitcast.ll
292

Sure, we will do it in the next patch.

LuoYuanke accepted this revision.Aug 17 2021, 12:46 AM

LGTM, thanks.

This revision is now accepted and ready to land.Aug 17 2021, 12:46 AM
This revision was landed with ongoing or failed builds.Aug 17 2021, 2:04 AM
This revision was automatically updated to reflect the committed changes.
fhahn added a comment.Aug 17 2021, 1:32 PM

This is X86 specific intrinsics. Should we document it in LangRef?

looks like target-specific ones are not documented there so far, not sure if there's a different place to document them.

This appears to be causing failures on EXPENSIVE_CHECKS builds: https://bugs.llvm.org/show_bug.cgi?id=51513

RKSimon added inline comments.Aug 17 2021, 4:14 PM
llvm/lib/Target/X86/X86LowerAMXType.cpp
1049

Doesn't C needs to be updated based on the results of combineAMXcast and transformAllAMXCast as well ?

LuoYuanke added inline comments.Aug 17 2021, 4:35 PM
llvm/lib/Target/X86/X86LowerAMXType.cpp
1049

Yes. C should be updated. Thanks, Simon.

pengfei added inline comments.Aug 17 2021, 7:31 PM
llvm/lib/Target/X86/X86LowerAMXType.cpp
178

This function has no user now?

pengfei added inline comments.Aug 17 2021, 7:35 PM
llvm/lib/Target/X86/X86LowerAMXType.cpp
915

PN has no user. Should be removed to avoid warning.

I created patch https://reviews.llvm.org/D108269 to fix the bug and address Pengfei's comments.

uabelho added inline comments.
llvm/lib/Target/X86/X86LowerAMXType.cpp
178

I agree, this function seems to be unused? I get the following warning compiling trunk with gcc:

../lib/Target/X86/X86LowerAMXType.cpp:181:8: warning: 'llvm::Value* {anonymous}::X86LowerAMXType::getRowFromCol(llvm::Instruction*, llvm::Value*, unsigned int)' defined but not used [-Wunused-function]
  181 | Value *X86LowerAMXType::getRowFromCol(Instruction *II, Value *V,
      |        ^~~~~~~~~~~~~~~