Page MenuHomePhabricator

[X86][LLVM][Canonical Compare Intrinsics] Creating a canonical representation for X86 CMP intrinsics
AbandonedPublic

Authored by m_zuckerman on Mar 27 2017, 6:55 AM.

Details

Summary

This patch adds new canonical representation to the instcombineCall pass.
This new functionality will switch the Compare intrinsic's operands and
will create one canonical representation to the compare intrinsics:
CMP (A, CONST)

Diff Detail

Event Timeline

m_zuckerman created this revision.Mar 27 2017, 6:55 AM
m_zuckerman edited the summary of this revision. (Show Details)
m_zuckerman retitled this revision from [X86][Canonical Compare Intrinsics] Creating a canonical representation for X86 CMP intrinsics to [X86][LLVM][Canonical Compare Intrinsics] Creating a canonical representation for X86 CMP intrinsics.
craig.topper added inline comments.Mar 27 2017, 10:57 AM
lib/Transforms/InstCombine/InstCombineCalls.cpp
1563

How are you ensuring the VEX encoding is valid?

1569

Unfortunately, there's nothing in the backend IR parsing that guarantees that only a constant for the last intrinsic argument can get here. If you write a bad IR file you can fail this cast. Use a dyn_cast and check it defensively. A bad intrinsic will fail isel later and throw a graceful error, but a bad value here would just cause a crash.

1592

Why are you using ConstantExpr::getIntegerValue? You should be able to use ConstantInt::get right?

2392

What about sse_cmp_ps, sse2_cmp_pd and their avx2 equivalents?

2505

These don't use an i32 for the comparison type immediate, but the X86CreateCanonicalCMP assumes they do when it creates the new Constant.

test/Transforms/InstCombine/X86CanonicCmp.ll
2

Add test cases for sse_cmp_ss and sse2_cmp_sd.

craig.topper edited edge metadata.Mar 27 2017, 11:00 AM

Why are we doing this in InstCombine instead of DAG combine? This won't enable any additional InstCombine optimizations.

m_zuckerman marked 2 inline comments as done.
m_zuckerman added a comment.EditedMar 30 2017, 12:30 PM

Why are we doing this in InstCombine instead of DAG combine? This won't enable any additional InstCombine optimizations.

First, We are doing that because we are working on intrinsics and I don't do any special changing. I only reorder the operands.

Secondly, we can do it in the lowering but we prefer that this canonical representation will be in earlier step.

lib/Transforms/InstCombine/InstCombineCalls.cpp
2392

As I wrote in the comments. We are only working on AVX intrinsics.

2505

In the main function, we only accept AVX and above intrinsics. This ensures us that we only work with i32 operands.

m_zuckerman marked an inline comment as done.Mar 30 2017, 12:54 PM
m_zuckerman added inline comments.
lib/Transforms/InstCombine/InstCombineCalls.cpp
1563

I added assert check that intrinsics contains avx.

filcab added a subscriber: filcab.Apr 3 2017, 8:23 AM
filcab added inline comments.
lib/Transforms/InstCombine/InstCombineCalls.cpp
1590

Spaces after commas, not before. Please be consistent in capitalization and hyphenation.

test/Transforms/InstCombine/X86CanonicCmp.ll
6

Typo in the function name (it could also be a bit more specific).
Please explain what a canonical compare is in the comment.
Or simply something like:

Transform compare(constant, variable, comparison, ...) into compare(variable, constant, flip(comparison), ...)
49

What's happening here?
Can you get more meaningful names and maybe do a log2 and chain?

m_zuckerman updated this revision to Diff 94182.Apr 5 2017, 3:10 AM
m_zuckerman marked 2 inline comments as done.
m_zuckerman marked an inline comment as done.
m_zuckerman updated this revision to Diff 94196.Apr 5 2017, 4:35 AM
m_zuckerman updated this revision to Diff 94197.

What is the plan for supporting the SSE intrinsics?

What does this canonicalization enable if we can't properly do it for the SSE intrinsics? Are we getting worse codegen for the scalar and 128-bit intrinsics on AVX targets just because we can't know we're an AVX target in InstCombine?

lib/Transforms/InstCombine/InstCombineCalls.cpp
1570

IntrinsicName is unused in release builds and will throw a warning. Probably need to wrap it in #ifndef NDEBUG

1578

You need to check for nullptr have a dyn_cast.

2402

What about x86_avx_cmp_ps_256 and x86_avx_cmp_pd_256?

zvi edited edge metadata.Apr 19 2017, 4:31 AM

What is the plan for supporting the SSE intrinsics?

What does this canonicalization enable if we can't properly do it for the SSE intrinsics? Are we getting worse codegen for the scalar and 128-bit intrinsics on AVX targets just because we can't know we're an AVX target in InstCombine?

IIUC, the issue here is that @llvm.x86.sse.cmp.* instrinsics function calls can be lowered to either SSE encoded instructions or to AVX instructions. The difference in not only in the encoding, but also in the possible predicates the instructions support.
The SSE variants support 8 predicates and the AVX variants support a richer set of 32 variants.
The concern is that at InstCombine-time we don't have knowledge about what subtarget features will be enabled so we can't replace an SSE immediate with an AVX-only immediate because if we end-up lowering to an SSE target, we don't expect (or at least now it doesn't) the backend to reverse back to an SSE-legal form.
But what if the function contains "target-cpu" or "target-features" attributes which will allow us to assume these will be used by the backend? would it be ok then to perform the canonicalization?

m_zuckerman abandoned this revision.Oct 7 2017, 11:41 PM