This is an archive of the discontinued LLVM Phabricator instance.

[HardwareLoops] Loop counter guard Loop intrinsic
ClosedPublic

Authored by samparker on Jun 26 2019, 1:15 AM.

Details

Summary

Introduce llvm.test.set.loop.iterations which sets the loop counter and also produces an i1 after testing that the count is not zero.

Diff Detail

Event Timeline

samparker created this revision.Jun 26 2019, 1:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 26 2019, 1:15 AM

Just a first look, some questions inline.

include/llvm/IR/Intrinsics.td
1195

Perhaps it's blindingly obvious, but maybe a comment that this is useful for while-do hardware kind of loops?

lib/CodeGen/HardwareLoops.cpp
269

Is there always a preheader? Can we dereference a nullptr here?

276

perhaps silly question, I guess the assumption here is that is if this is a conditional then it must be the loop check, but is that always true?

samparker marked 2 inline comments as done.Jun 26 2019, 11:48 PM
samparker added inline comments.
lib/CodeGen/HardwareLoops.cpp
269

Just before landing here we create a preheader if one doesn't exist.

276

Well it certainly is controlling whether the loop is entered or not... I had the same thought tbh but I reasoned that any other condition of entering a loop would be handled before the iteration condition - i will add some tests along these lines though.

samparker marked an inline comment as done.Jun 27 2019, 1:32 AM
samparker added inline comments.
lib/CodeGen/HardwareLoops.cpp
276

My assumption was totally wrong :) so I'll revert to what I did originally: check for an icmp ne 0 so that the semantics don't change. I think we should be able to handle other icmp later by transforming them, if it's profitable.

SjoerdMeijer added inline comments.Jun 27 2019, 1:41 AM
lib/CodeGen/HardwareLoops.cpp
276

cheers, sounds like a good plan

samparker updated this revision to Diff 206815.Jun 27 2019, 3:54 AM

Added a check that we're replacing the conditional with an equivalent statement... The intrinsic tests for a non-zero value for the trip count, so we now ensure that the loop entry is guarded by icmp ne %trip.count, 0. This has meant the code has been moved around a bit from the previous patch.

SjoerdMeijer added inline comments.Jun 27 2019, 5:25 AM
lib/CodeGen/HardwareLoops.cpp
307

A nit, and more a style preference, so please ignore if you disagree!
The first thing in initLoopCount is a very big lambda definition. But I think CanGenerateTest would be a good candidate for a local helper function as it requires very few arguments and only returns a bool, so that InitLoopCount becomes smaller and easier to read.

326

Another nit: I think this could be nice candidate for a lambda to share code, i.e. the checks are the same in the if-clause and else-if clause, just the indexes are swapped.

352

I am now actually wondering if we are checking here if the loop is particular kind of loop, i.e. a Single-Entry Single-Exit loop? Or, do we e.g. get this 'guarantee' from SCEV, because otherwise we can't calculate an iteration count? But what happens here if there are multiple predecessors?

samparker marked an inline comment as done.Jun 27 2019, 5:46 AM
samparker added inline comments.
lib/CodeGen/HardwareLoops.cpp
352

There can be multiple exits, but the search for the given exit block is quite odd in my opinion... the exit block that forms the loop has to be executed on every iteration - dominating all other exit blocks. The trip count is given for that exit block.

Multiple predecessors of what...? For the loop preheader, we just don't try to guard the entry and place the normal set.loop.iteration counter in the preheader. I'm wondering whether we could actually use the test.set.loop.iterations in the case of multiple preheader predecessors by placing the intrinsic in all those predecessors.... Something to look at later.

samparker updated this revision to Diff 206840.Jun 27 2019, 5:49 AM
  • CanGenerateTest is its own function - also containing a lambda to check the icmp.
  • Also using isLoopGuardedByCond to try to avoid the loop count being expanded before its required.

Last question I guess, I know it defaults to false, and hope I didn't miss it, but perhaps good to have a test too that explicitly sets -force-hardware-loop-guard=false?

lib/CodeGen/HardwareLoops.cpp
352

Ok, got it, thanks!

samparker updated this revision to Diff 206849.Jun 27 2019, 6:35 AM
  • Added a test line for -force-hardware-loop-guard=false.
  • Also added a test line for -force-hardware-loop-phi=true for some extra coverage.
SjoerdMeijer accepted this revision.Jun 27 2019, 6:47 AM

Looks good to me!
Perhaps wait a day with committing just in case someone has more comments.

This revision is now accepted and ready to land.Jun 27 2019, 6:47 AM
This revision was automatically updated to reflect the committed changes.