This is an archive of the discontinued LLVM Phabricator instance.

[x86] convert logic-of-FP-compares to FP logic-of-vector-compares
ClosedPublic

Authored by spatel on Sep 23 2021, 9:35 AM.

Details

Summary

This is motivated by the examples and discussion in:
https://llvm.org/PR51245
...and related bugs.

By using vector compares and vector logic, we can convert 2 'set' instructions into 1 'movd' or 'movmsk' and generally improve throughput/reduce instructions.

Unfortunately, we don't have a complete vector compare ISA before AVX, so I left SSE-only out of this patch. Ie, we'd need extra logic ops to simulate the missing predicates for SSE 'cmpp*', so it's not as clearly a win.

Diff Detail

Event Timeline

spatel created this revision.Sep 23 2021, 9:35 AM
spatel requested review of this revision.Sep 23 2021, 9:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 23 2021, 9:35 AM
RKSimon added inline comments.Sep 23 2021, 10:10 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
45511

Why not use SCALAR_TO_VECTOR ?

llvm/test/CodeGen/X86/fcmp-logic.ll
238

These could still be merged (as long as we only use the lower 32-bits).?

RKSimon added inline comments.Sep 23 2021, 10:14 AM
llvm/test/CodeGen/X86/fcmp-logic.ll
2–4

We should probably add AVX512 coverage to ensure we're not doing anything that could be done better with kmask registers

spatel marked 3 inline comments as done.Sep 23 2021, 11:37 AM
spatel added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp
45511

No reason - forgot we had that opcode.

llvm/test/CodeGen/X86/fcmp-logic.ll
238

Yes - we'd need to add a cast to make it work, but hopefully that would get folded away at some point.
I'll add a TODO for now.

spatel updated this revision to Diff 374638.Sep 23 2021, 11:42 AM
spatel marked 2 inline comments as done.

Patch updated:

  1. Use SCALAR_TO_VECTOR opcode to reduce code.
  2. Added avx512f to test to show k-reg codegen.
  3. Added TODO comment about enhancement for mismatched FP types.
RKSimon accepted this revision.Sep 23 2021, 1:07 PM

LGTM - cheers

I don't think this patch will cover it, but as a follow up please could you add test coverage for logic containing more than 2 fcmps?

llvm/lib/Target/X86/X86ISelLowering.cpp
45510

Do we need to check/assert that VT is MVT::i1 ?

This revision is now accepted and ready to land.Sep 23 2021, 1:07 PM
pengfei added inline comments.Sep 23 2021, 10:53 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
45452–45454

Nit, why use hyphen here but leave space above?

45510

I think it's Ok since the logic operands come from setcc directly. Do we need to handle (logic (zext (setcc ...?

pengfei added inline comments.Sep 23 2021, 10:56 PM
llvm/test/CodeGen/X86/fcmp-logic.ll
242–258

Should we add a common CHECK for them?

spatel marked 3 inline comments as done.Sep 24 2021, 5:17 AM
spatel added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp
45452–45454

I just overlooked the first one - will fix it.

45510

I haven't been able to find an example where we have the pattern late only, but the result of x86 scalar setcc could be i8 after some transforms.

There's no difference on the existing tests if I add the check for i1, so I'll do that.

Also, the pattern where we have sext/zext between the logic op and setcc is already folded away by DAGCombiner (the cast is moved after the logic), so I don't think we need to worry about that case.

llvm/test/CodeGen/X86/fcmp-logic.ll
242–258

It's the "v" in the mnemonics that makes these different. Do we have a scrubber option in the script for that?

spatel updated this revision to Diff 374811.Sep 24 2021, 5:22 AM
spatel marked 2 inline comments as done.

Patch updated:

  1. Made "floating-point" consistent in the function comment.
  2. Added MVT::i1 constraint for safety.
pengfei accepted this revision.Sep 24 2021, 5:36 AM

LGTM.

llvm/test/CodeGen/X86/fcmp-logic.ll
242–258

Oh, yes. I don't think there's such an option. I did have the same thought. But I think it's not easy to achieve. It's valuable only when we check SSE and AVX instructions at the same time. Omitting 'v' unconditionally is over kill.

This revision was landed with ongoing or failed builds.Sep 24 2021, 8:38 AM
This revision was automatically updated to reflect the committed changes.