This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Mark -1 as cheap in xor's for thumb1
ClosedPublic

Authored by dmgreen on Feb 19 2018, 1:45 AM.

Details

Summary

We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1
would not otherwise be considered a cheap constant. This prevents the
-1's from being pulled out into constants and potentially hoisted.

The loop_2 tests are the only ones that change here.

Diff Detail

Repository
rL LLVM

Event Timeline

dmgreen created this revision.Feb 19 2018, 1:45 AM

This is the same motivating case as D42951 - abandon that patch?

I haven't looked at ConstantHoisting much, but this seems like a good solution to the problem. I wonder if we could generalize this for all targets by checking if a 'not' op is available.

Might want to commit the tests first with auto-generated output using utils/update_llc_test_checks.py, so we just see the improvement here.

There's a FIXME note about global isel for the div-by-constant case, so this problem also disappears once we have that? Maybe the whole thing disappears, so it's not worth repeating the comment.

This is the same motivating case as D42951 - abandon that patch?

Yep. This is part one out of two or three. I'm not sure yet, depends on how part 2 looks. I don't think it will solve the register pressure part of the problem (not entirely at least), but it should always be a net gain when it comes up.

I haven't looked at ConstantHoisting much, but this seems like a good solution to the problem. I wonder if we could generalize this for all targets by checking if a 'not' op is available.

Hmm. I didn't even consider other targets. I think the problem is getIntImmCost above (without an opcode) returning 2 for thumb1 targets. Everything else (Thumb2/Arm) is going to treat -1 as a cheap constant anyway. For other targets, I believe the default is to return Free for all constants.

Might want to commit the tests first with auto-generated output using utils/update_llc_test_checks.py, so we just see the improvement here.

Sure, sounds sensible. Like I said it's just loop_2 that's updated here.

There's a FIXME note about global isel for the div-by-constant case, so this problem also disappears once we have that? Maybe the whole thing disappears, so it's not worth repeating the comment.

Yeah I think so. I presume global ISel will have a much better time looking across basic blocks and a lot of codegen prepare/constant merge wont be needed any more. I think that comment applies to all costs in this function.

dmgreen updated this revision to Diff 134945.Feb 19 2018, 10:32 AM

I realised I could just show the updates here without having to commit the changes first. So that's what this is.

I've not committed the left hand side yet, let me know if I should anyway. Otherwise this should show the differences.

spatel accepted this revision.Feb 19 2018, 11:56 AM

LGTM.

This revision is now accepted and ready to land.Feb 19 2018, 11:56 AM
This revision was automatically updated to reflect the committed changes.
efriedma added inline comments.
llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.cpp
131

getSExtValue() will crash if Imm doesn't fit into an int64_t. Maybe use isAllOnesValue()?

dmgreen added inline comments.Feb 21 2018, 2:13 AM
llvm/trunk/lib/Target/ARM/ARMTargetTransformInfo.cpp
131

Ah. Where it's not -1. Of course. Sorry about that, thanks for the catch.

https://reviews.llvm.org/D43549